guessBarcodes parses given cell identifiers to identify the substring that corresponds to the nucleotide barcode included in the experiment.

Usage

guessBarcodes(cell_name, min_barcode_length = 6L)

Arguments

cell_name

character, a cell identifier, typically with a prefix and/or suffix (e.g. "ACTGATGCAT-1", "SampleA_ATGAACCTATGG")

min_barcode_length

integer, minimum length of the nucleotide barcode (Default: 6)

Value

a vector with the input cell_name decomposed into these three entries:

prefix

prefix present in the input cell_name (NA if cell_name has no prefix)

cell_name

the actual nucleotide barcode

suffix

suffix present in the input cell_name (NA if cell_name has no suffix)

Details

Numeric or string prefixes and suffixes are typically added to cell identifiers to avoid incorrect mapping across samples. However, they often create issues when merging data on the *same* sample that was annotated using different workflows. This function attempts to resolve such issues by extracting the nucleotide barcodes actually introduced in the experiment.
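
The decomposition described above can be illustrated with a minimal sketch. This is not the package's implementation: `guess_barcode_sketch` is a hypothetical helper, and it assumes that the barcode is the longest run of upper-case A/C/G/T/N characters of at least min_barcode_length, with any separator characters kept as part of the prefix or suffix. The real guessBarcodes may apply different rules.

```r
# Hypothetical illustration only; NOT the implementation of guessBarcodes.
# Assumption: the barcode is the longest run of [ACGTN] characters that is
# at least `min_barcode_length` long; separators stay in the prefix/suffix.
guess_barcode_sketch <- function(cell_name, min_barcode_length = 6L) {
  empty <- c(prefix = NA_character_, cell_name = NA_character_,
             suffix = NA_character_)

  # Locate every non-overlapping nucleotide run of sufficient length
  pattern <- sprintf("[ACGTN]{%d,}", as.integer(min_barcode_length))
  hits <- gregexpr(pattern, cell_name)[[1]]
  if (hits[1] == -1L) {
    return(empty)  # no candidate barcode found
  }

  # Keep the first of the longest runs
  lens  <- attr(hits, "match.length")
  best  <- which.max(lens)
  start <- hits[best]
  end   <- start + lens[best] - 1L

  # Everything before/after the run becomes prefix/suffix ("" means absent)
  prefix <- substr(cell_name, 1L, start - 1L)
  suffix <- substr(cell_name, end + 1L, nchar(cell_name))

  c(prefix    = if (nzchar(prefix)) prefix else NA_character_,
    cell_name = substr(cell_name, start, end),
    suffix    = if (nzchar(suffix)) suffix else NA_character_)
}

guess_barcode_sketch("ACTGATGCAT-1")
guess_barcode_sketch("SampleA_ATGAACCTATGG")
```

Under these assumptions, the first call yields the barcode "ACTGATGCAT" with suffix "-1" and no prefix, and the second yields the barcode "ATGAACCTATGG" with prefix "SampleA_" and no suffix.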