Demuxlet

This workflow runs demuxlet to deconvolute sample identity when multiple samples are pooled by barcoded single-cell sequencing.

Align your single-cell sequencing data (for example using the cellranger or drop_seq workflows).

Create a sample sheet.

Please note that the columns in the tab-separated file must be in the order shown below, and the file must not contain a header line.

Column	Description
Name	Sample name.
BAM	Location of the BAM file in the cloud (gs:// URL).
Barcodes	Location of the valid cellular barcodes file in the cloud (gs:// URL).
VCF	Location of the VCF file to use for this sample in the cloud (gs:// URL).

Example:
sample-1	gs://fc-e0000000/sample-1/out/possorted_genome_bam.bam	gs://fc-e0000000/sample-1/out/filtered_feature_bc_matrix/barcodes.tsv.gz	gs://fc-e0000000/sample-1.vcf
sample-2	gs://fc-e0000000/sample-2/out/possorted_genome_bam.bam	gs://fc-e0000000/sample-2/out/filtered_feature_bc_matrix/barcodes.tsv.gz	gs://fc-e0000000/sample-2.vcf
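
The sample sheet can also be generated programmatically. The following is a minimal Python sketch, assuming the four columns and the no-header rule described above; the function names `build_sample_sheet` and `write_sample_sheet` and the `SAMPLES` list are hypothetical helpers for illustration, not part of the workflow.

```python
import csv
import io

# Illustrative rows (name, BAM, barcodes, VCF); the gs:// paths mirror the example above.
SAMPLES = [
    (
        "sample-1",
        "gs://fc-e0000000/sample-1/out/possorted_genome_bam.bam",
        "gs://fc-e0000000/sample-1/out/filtered_feature_bc_matrix/barcodes.tsv.gz",
        "gs://fc-e0000000/sample-1.vcf",
    ),
    (
        "sample-2",
        "gs://fc-e0000000/sample-2/out/possorted_genome_bam.bam",
        "gs://fc-e0000000/sample-2/out/filtered_feature_bc_matrix/barcodes.tsv.gz",
        "gs://fc-e0000000/sample-2.vcf",
    ),
]


def build_sample_sheet(rows):
    """Return the sample sheet as tab-separated text without a header line."""
    for row in rows:
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        # Every column after the sample name must be a gs:// URL.
        for url in row[1:]:
            if not url.startswith("gs://"):
                raise ValueError(f"not a gs:// URL: {url!r}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def write_sample_sheet(path, rows):
    """Write the validated sample sheet to a local file."""
    with open(path, "w", newline="") as handle:
        handle.write(build_sample_sheet(rows))
```

Calling `write_sample_sheet("sample_sheet.tsv", SAMPLES)` would produce a local file ready to be uploaded to the workspace bucket.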

Upload your sample sheet to the workspace bucket.

Example:

gsutil cp /foo/bar/projects/sample_sheet.tsv gs://fc-e0000000/

Import the demuxlet workflow into your workspace.

See the Terra documentation for adding a workflow. The workflow is in the Broad Methods Repository under the name “cumulus/demuxlet”.

Next, on the workflow page, click the Export to Workspace... button, and select the workspace you want to export to in the drop-down menu.
In your workspace, open demuxlet in the WORKFLOWS tab. Select “Run workflow with inputs defined by file paths” and click the Save button.

Inputs

Please see the description of important inputs below.

Name	Description
tsv_file	Four-column tab-separated file without a header line, listing name, coordinate-sorted BAM, barcodes, and VCF
min_MQ	Minimum mapping quality to consider (default 20)
alpha	Grid of alpha values to search over (default [0.1, 0.2, 0.3, 0.4, 0.5])
min_TD	Minimum distance to the tail (default 0)
tag_group	Tag representing read groups or cell barcodes, used to partition the BAM file into multiple groups (default “CB”)
tag_UMI	Tag representing UMIs (default “UB”)
field	FORMAT field from which to extract the genotype, likelihood, or posterior (default “GT”)
geno_error	Offset of genotype error rate (default 0.1)
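
For reference, the defaults in the table above can be collected into an inputs file. The sketch below is illustrative only: it assumes the common WDL convention of prefixing each input with the workflow name, and the prefix `demuxlet`, the helper `build_inputs`, and the bucket path are assumptions rather than values confirmed by this page.

```python
import json

# Default values copied from the inputs table above.
DEFAULTS = {
    "min_MQ": 20,
    "alpha": [0.1, 0.2, 0.3, 0.4, 0.5],
    "min_TD": 0,
    "tag_group": "CB",
    "tag_UMI": "UB",
    "field": "GT",
    "geno_error": 0.1,
}


def build_inputs(tsv_file, workflow_name="demuxlet", **overrides):
    """Merge overrides into the defaults and prefix keys with the workflow name.

    The "<workflow>.<input>" key layout is an assumption based on the usual
    WDL inputs convention; check the workflow page for the exact names.
    """
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown inputs: {sorted(unknown)}")
    values = {"tsv_file": tsv_file, **DEFAULTS, **overrides}
    return {f"{workflow_name}.{key}": value for key, value in values.items()}


if __name__ == "__main__":
    # Hypothetical sample sheet location in the workspace bucket.
    inputs = build_inputs("gs://fc-e0000000/sample_sheet.tsv", min_MQ=30)
    print(json.dumps(inputs, indent=2))
```

Rejecting unknown names in `build_inputs` is a deliberate choice here: a misspelled input would otherwise be silently ignored and the default used instead.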

Outputs

The demuxlet output file contains the best guess of the sample identity, along with the detailed statistics used to reach that guess.
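
A quick way to inspect such a file is to tally the calls by type. The sketch below rests on assumptions that this page does not confirm: that the output is a tab-separated table with a header, that it has a `BEST` column, and that each value starts with `SNG`, `DBL`, or `AMB` (singlet, doublet, ambiguous), as described in the demuxlet project's own documentation. The rows in `EXAMPLE_BEST` are made up for illustration and `summarize_calls` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration; a real output file has many more columns.
EXAMPLE_BEST = (
    "BARCODE\tBEST\n"
    "AAACCTGAGAAACCAT-1\tSNG-sample-1\n"
    "AAACCTGAGAAACCGC-1\tSNG-sample-2\n"
    "AAACCTGAGAAACCTA-1\tDBL-sample-1-sample-2-0.500\n"
    "AAACCTGAGAAACGAG-1\tAMB-sample-1-sample-2-sample-1/sample-2\n"
)


def summarize_calls(handle):
    """Count rows by the call type that prefixes the assumed BEST column."""
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        # Split only on the first hyphen so sample names may contain hyphens.
        call_type = row["BEST"].split("-", 1)[0]
        counts[call_type] += 1
    return counts


if __name__ == "__main__":
    for call_type, count in sorted(summarize_calls(io.StringIO(EXAMPLE_BEST)).items()):
        print(call_type, count)
```

For a real run, replace the `io.StringIO` wrapper with an open file handle on the downloaded output.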