Fit Transition Path Theory (TPT) on the cellrank transition models
fitTPT fits Transition Path Theory on the transition model defined using fitTransitionModel(), at
a 'coarse-grained' level where transitions are considered between *groups* of cells with grouping indicated by the user.
Usage
fitTPT(
anndata_file,
CellrankObj,
group.cells.by,
source_state,
target_state,
conda_env = "scicsr",
random_n = 100,
do_pca = TRUE,
do_neighbors = TRUE
)
Arguments
- anndata_file
filename pointing to the AnnData file.
- CellrankObj
the output of fitTransitionModel().
- group.cells.by
character, the column in the metadata by which to group cells.
- source_state
character, a value in the group.cells.by column which is taken as the source state for fitting transition path theory. All cells belonging to this group are considered as the source.
- target_state
character, a value in the group.cells.by column which is taken as the target state for fitting transition path theory. All cells belonging to this group are considered as the target.
- conda_env
character, the name of the conda environment in which to perform the TPT analysis (default: "scicsr"). If NULL, no conda environment is used and the program assumes the Python packages scanpy, scvelo and cellrank are installed in the local Python.
- random_n
number of times to reshuffle transition matrix columns to derive randomised models (default: 100).
- do_pca
Should principal component analysis (PCA) be re-computed on the data? (Default: TRUE)
- do_neighbors
Should k-nearest neighbour (kNN) graph be re-computed on the data? (Default: TRUE)
Value
a list with these entries:
- gross_flux
an n-by-n matrix (where n is the total number of states) of total fluxes estimated from one state (row) to another state (column).
- pathways
a data.frame indicating the possible paths to take from source_state to target_state, and the likelihood (max: 100) to travel through each stated path.
- significance
an n-by-n matrix (where n is the total number of states) of one-sided empirical probabilities that the observed gross flux is greater than the flux estimated in the randomised models.
- total_gross_flux
element-wise sum of the gross_flux matrix.
- total_gross_flux_reshuffled
element-wise sum of the gross_flux matrix, calculated for each of the randomised models (transition matrix columns randomly reshuffled).
- gross_flux_randomised
gross_flux matrix, but from the randomised TPT models (transition matrix columns randomly reshuffled).
- mfpt
Mean First Passage Time required to travel from source_state to target_state, as estimated by Transition Path Theory.
- mfpt_reshuffled
Mean First Passage Time required to travel from source_state to target_state, as estimated by Transition Path Theory, calculated for each of the randomised models (transition matrix columns randomly reshuffled).
- stationary_distribution
Equilibrium probability of each state as estimated by Transition Path Theory.
- stationary_distribution_reshuffled
Equilibrium probability of each state as estimated by Transition Path Theory, calculated for each of the randomised models (transition matrix columns randomly reshuffled).
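To illustrate what the mfpt entry represents, the base R sketch below computes mean first passage times to a target state on a toy three-state transition matrix. It uses the standard linear system m_i = 1 + sum_j P_ij m_j for the non-target states. The matrix values and state names are made up for illustration, and this is a textbook calculation under those assumptions, not necessarily the routine fitTPT runs internally.

```r
# Toy coarse-grained transition matrix (made-up values; each row sums to 1).
states <- c("Naive", "Activated", "Memory")   # hypothetical state names
P <- matrix(c(0.80, 0.15, 0.05,
              0.10, 0.70, 0.20,
              0.05, 0.15, 0.80),
            nrow = 3, byrow = TRUE, dimnames = list(states, states))

target <- "Memory"
others <- setdiff(states, target)

# Restrict the chain to the non-target states and solve (I - Q) m = 1.
Q <- P[others, others, drop = FALSE]
m <- solve(diag(length(others)) - Q, rep(1, length(others)))
names(m) <- others

# Expected number of steps to first reach the target, by starting state
# (zero when starting in the target itself).
mfpt_to_target <- c(m, setNames(0, target))[states]
print(mfpt_to_target)
```

The values are expressed in steps of the Markov chain; how those steps relate to biological time depends on how the transition model was built.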
Details
fitTPT interfaces with the Python deeptime package (and reimplements some of its routines to improve efficiency) to fit transition path theory (TPT) onto the Markov state model defined by running
the fitTransitionModel function, which uses cellrank under the hood. With the parameter group.cells.by, the user specifies
a scheme to group individual rows/columns of the transition matrix (for example, by cell type or by isotype). The function
then fits TPT onto this grouped ('coarse-grained') transition matrix, once the user indicates a likely 'source' and 'target' state.
The outputs are the estimated information flows ('flux') between different states on the way from the source to the target,
and the probabilities of sampling each state at equilibrium ('stationary distribution').
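The steps described above can be sketched in base R on toy data. The example coarse-grains a random cell-level transition matrix by group, computes the stationary distribution, and derives the gross flux from the forward and backward committors using the textbook discrete-time TPT formulas. Several things here are assumptions rather than facts about fitTPT: the cell-level matrix is random, the grouping and state names are hypothetical, and cells within a group are weighted equally when coarse-graining. The actual implementation may weight or compute things differently.

```r
set.seed(1)

# Toy cell-level transition matrix for 6 cells (random values, rows normalised to sum to 1).
P_cells <- matrix(runif(36), nrow = 6)
P_cells <- P_cells / rowSums(P_cells)

# Hypothetical grouping of the 6 cells into 3 states.
groups <- factor(c("A", "A", "B", "B", "C", "C"))

# Coarse-graining (ASSUMPTION: cells within a group are weighted equally).
# First sum probabilities over destination cells in each group,
# then average over the origin cells of each group.
into_groups <- t(rowsum(t(P_cells), groups))
P_coarse <- rowsum(into_groups, groups) / as.vector(table(groups))
states <- rownames(P_coarse)

# Stationary distribution: left eigenvector of P_coarse for eigenvalue 1.
pi_hat <- Re(eigen(t(P_coarse))$vectors[, 1])
pi_hat <- setNames(pi_hat / sum(pi_hat), states)

src <- "A"; tgt <- "C"
mid <- setdiff(states, c(src, tgt))
I_mid <- diag(length(mid))

# Forward committor: probability of reaching tgt before returning to src.
q_plus <- setNames(numeric(length(states)), states)
q_plus[tgt] <- 1
q_plus[mid] <- solve(I_mid - P_coarse[mid, mid, drop = FALSE],
                     rowSums(P_coarse[mid, tgt, drop = FALSE]))

# Backward committor, computed on the time-reversed chain.
P_rev <- t(P_coarse * pi_hat) / pi_hat
q_minus <- setNames(numeric(length(states)), states)
q_minus[src] <- 1
q_minus[mid] <- solve(I_mid - P_rev[mid, mid, drop = FALSE],
                      rowSums(P_rev[mid, src, drop = FALSE]))

# Gross flux: f_ij = pi_i * q_minus_i * P_ij * q_plus_j for i != j.
gross_flux <- outer(pi_hat * q_minus, q_plus) * P_coarse
diag(gross_flux) <- 0
print(round(gross_flux, 4))
```

A useful sanity check on such a sketch is flux conservation: the total flux leaving the source state should equal the total flux arriving at the target state.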
A random 'null background' model is fitted by randomly reshuffling the columns of the transition matrix random_n (default: 100) times.
These randomised fluxes help determine the significance of an observed flux, by calculating the one-sided empirical probability that the observed flux is larger than that observed in the randomised models.
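The randomisation idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions: 'reshuffling columns' is read here as permuting the column order of the transition matrix (which keeps every row summing to 1), the toy matrix is made up, and toy_flux is a hypothetical stand-in statistic (stationary probability times transition probability) used in place of the TPT gross flux to keep the example short.

```r
set.seed(42)

# Toy 3-state transition matrix (made-up values; each row sums to 1).
st <- c("A", "B", "C")
P <- matrix(c(0.70, 0.20, 0.10,
              0.30, 0.50, 0.20,
              0.10, 0.30, 0.60),
            nrow = 3, byrow = TRUE, dimnames = list(st, st))

# Stationary distribution: left eigenvector for eigenvalue 1.
stationary <- function(P) {
  v <- Re(eigen(t(P))$vectors[, 1])
  v / sum(v)
}

# Hypothetical stand-in for the flux: pi_i * P_ij, with the diagonal zeroed.
toy_flux <- function(P) {
  f <- stationary(P) * P
  diag(f) <- 0
  f
}

observed <- toy_flux(P)

# Null models: permute the column order random_n times and recompute the statistic.
random_n <- 100
null_flux <- replicate(random_n, {
  shuffled <- P[, sample(ncol(P))]
  dimnames(shuffled) <- dimnames(P)
  toy_flux(shuffled)
})  # 3 x 3 x random_n array

# One-sided empirical probability, per entry, that the observed value
# exceeds the value from a randomised model.
prob_greater <- rowMeans(array(observed, dim(null_flux)) > null_flux, dims = 2)
dimnames(prob_greater) <- dimnames(P)
print(prob_greater)
```

Diagonal entries are always 0 in this sketch because the statistic is zeroed on the diagonal in both the observed and randomised models. With only random_n = 100 randomisations, the resolution of the empirical probabilities is limited to steps of 0.01.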