Tools for working with genomic and high-throughput sequencing data.
The following tools are available in fgbio version 2.4.0.
Tools for manipulating basecalling data.
Tool | Description |
---|---|
ExtractBasecallingParamsForPicard | Extracts sample and library information from a sample sheet for a given lane |
ExtractIlluminaRunInfo | Extracts information about an Illumina sequencing run from the RunInfo |
Tools for manipulating FASTA files.
Tool | Description |
---|---|
CollectAlternateContigNames | Collates the alternate contig names from an NCBI assembly report |
HardMaskFasta | Converts soft-masked sequence to hard-masked in a FASTA file |
SortSequenceDictionary | Sorts a sequence dictionary file in the order of another sequence dictionary |
UpdateFastaContigNames | Updates the sequence names in a FASTA |
UpdateIntervalListContigNames | Updates the sequence names in an Interval List file |
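
As a rough illustration of what converting soft-masked sequence to hard-masked means, the sketch below replaces lowercase (soft-masked) bases with `N` while leaving header lines untouched. The function names `hard_mask` and `hard_mask_fasta` are hypothetical and chosen for this example; this is not fgbio's implementation, only a minimal sketch of the idea.

```python
def hard_mask(sequence: str, mask_char: str = "N") -> str:
    """Replace every lowercase (soft-masked) base with mask_char."""
    return "".join(mask_char if base.islower() else base for base in sequence)


def hard_mask_fasta(lines):
    """Yield FASTA lines with sequence lines hard-masked; headers pass through."""
    for line in lines:
        # Header lines start with '>' and must not be altered.
        yield line if line.startswith(">") else hard_mask(line)


# Illustrative input only: one header line and one sequence line.
masked = list(hard_mask_fasta([">chr1 example", "ACGTacgtNNnnACGT"]))
print(masked)  # [">chr1 example", "ACGTNNNNNNNNACGT"]
```

Uppercase bases and existing `N` characters are kept as they are, so only the soft-masked regions change.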
Tools for manipulating FASTQ files.
Tool | Description |
---|---|
DemuxFastqs | Performs sample demultiplexing on FASTQs |
FastqToBam | Generates an unmapped BAM (or SAM or CRAM) file from FASTQ files |
SortFastq | Sorts a FASTQ file |
TrimFastq | Trims reads in one or more line-matched FASTQ files to a specific read length |
Tools for RNA-Seq data.
Tool | Description |
---|---|
CollectErccMetrics | Collects metrics for ERCC spike-ins for RNA-Seq experiments |
EstimateRnaSeqInsertSize | Computes the insert size for RNA-Seq experiments |
Tools for manipulating SAM, BAM, or related data.
Tool | Description |
---|---|
AnnotateBamWithUmis | Annotates existing BAM files with UMIs (Unique Molecular Indices, aka Molecular IDs, Molecular barcodes) from separate FASTQ files |
AssignPrimers | Assigns reads to primers post-alignment |
AutoGenerateReadGroupsByName | Adds read groups to a BAM file for a single sample by parsing the read names |
CallOverlappingConsensusBases | Consensus calls overlapping bases in read pairs |
ClipBam | Clips reads from the same template |
DownsampleAndNormalizeBam | Downsamples a BAM in a biased way to a uniform coverage across regions |
ErrorRateByReadPosition | Calculates the error rate by read position on coordinate sorted mapped BAMs |
EstimatePoolingFractions | Examines sequence data generated from a pooled sample and estimates the fraction of sequence data coming from each constituent sample |
ExtractUmisFromBam | Extracts unique molecular indexes from reads in a BAM file into tags |
FilterBam | Filters reads out of a BAM file |
FindSwitchbackReads | Finds reads where a template switch occurred during library construction |
FindTechnicalReads | Finds reads that are from technical or synthetic sequences in a BAM file |
RandomizeBam | Randomizes the order of reads in a SAM or BAM file |
RemoveSamTags | Removes SAM tags from a SAM or BAM file |
SetMateInformation | Adds and/or fixes mate information on paired-end reads |
SortBam | Sorts a SAM or BAM file |
SplitBam | Splits a BAM into multiple BAMs, one per read group (or library) |
TrimPrimers | Trims primers from reads post-alignment |
UpdateReadGroups | Updates one or more read groups and their identifiers |
ZipperBams | Zips together an unmapped and mapped BAM to transfer metadata into the output BAM |
Tools for manipulating UMIs and reads tagged with UMIs.
Tool | Description |
---|---|
CallDuplexConsensusReads | Calls duplex consensus sequences from reads generated from the same double-stranded source molecule |
CallMolecularConsensusReads | Calls consensus sequences from reads with the same unique molecular tag |
CollectDuplexSeqMetrics | Collects a suite of metrics to QC duplex sequencing data |
CopyUmiFromReadName | Copies the UMI at the end of the BAM’s read name to the RX tag |
CorrectUmis | Corrects UMIs stored in BAM files when a set of fixed UMIs is in use |
FilterConsensusReads | Filters consensus reads generated by CallMolecularConsensusReads or CallDuplexConsensusReads |
GroupReadsByUmi | Groups reads together that appear to have come from the same original molecule |
ReviewConsensusVariants | Extracts data to make reviewing of variant calls from consensus reads easier |
Various utility programs.
Tool | Description |
---|---|
PickIlluminaIndices | Picks a set of molecular indices that should work well together |
PickLongIndices | Picks a set of molecular indices that have at least a given number of mismatches between them |
UpdateDelimitedFileContigNames | Updates the contig names in columns of a delimited data file |
UpdateGffContigNames | Updates the contig names in a GFF |
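
To make the idea of indices with "at least a given number of mismatches between them" concrete, the sketch below counts mismatches as Hamming distance and greedily keeps candidates that are far enough from everything already picked. The names `hamming` and `pick_indices` and the greedy strategy are assumptions made for this example; they are not taken from fgbio and may differ from how PickLongIndices actually works.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pick_indices(candidates, min_mismatches: int):
    """Greedily keep candidates that differ from every already-picked
    index by at least min_mismatches positions (illustrative only)."""
    picked = []
    for candidate in candidates:
        if all(hamming(candidate, p) >= min_mismatches for p in picked):
            picked.append(candidate)
    return picked


# Illustrative candidates only.
chosen = pick_indices(["AAAA", "AAAT", "TTTT", "AATT", "CCCC"], min_mismatches=3)
print(chosen)  # ['AAAA', 'TTTT', 'CCCC']
```

A greedy pass depends on candidate order and does not guarantee the largest possible set; it is used here only because it is short and easy to follow.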
Tools for manipulating VCF, BCF, or related data.
Tool | Description |
---|---|
AssessPhasing | Assesses the accuracy of phasing for a set of variants |
FilterSomaticVcf | Applies one or more filters to a VCF of somatic variants |
FixVcfPhaseSet | Adds/fixes the phase set (PS) genotype field |
HapCutToVcf | Converts the output of HAPCUT (HapCut1/HapCut2) to a VCF |
MakeMixtureVcf | Creates a VCF with one sample whose genotypes are a mixture of other samples’ |
MakeTwoSampleMixtureVcf | Creates a simulated tumor or tumor/normal VCF by in-silico mixing genotypes from two samples |
UpdateVcfContigNames | Updates the contig names in a VCF |