Tools for working with genomic and high throughput sequencing data.
The following tools are available in fgbio version 3.1.0-09e13bc-SNAPSHOT.
Tools for manipulating basecalling data.
| Tool | Description |
|---|---|
| ExtractBasecallingParamsForPicard | Extracts sample and library information from a sample sheet for a given lane |
| ExtractIlluminaRunInfo | Extracts information about an Illumina sequencing run from the RunInfo |
Tools for manipulating FASTA files.
| Tool | Description |
|---|---|
| CollectAlternateContigNames | Collates the alternate contig names from an NCBI assembly report |
| HardMaskFasta | Converts soft-masked sequence to hard-masked in a FASTA file |
| SortSequenceDictionary | Sorts a sequence dictionary file in the order of another sequence dictionary |
| UpdateFastaContigNames | Updates the sequence names in a FASTA |
| UpdateIntervalListContigNames | Updates the sequence names in an Interval List file |
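
As an illustration of what the HardMaskFasta description means, the sketch below treats lower-case bases as soft-masked and replaces them with `N`. It is a minimal sketch of the general soft-to-hard masking idea, not fgbio's implementation. The function names `hard_mask` and `hard_mask_fasta` are hypothetical, and the example assumes `N` is the masking character.

```python
def hard_mask(sequence: str, mask_char: str = "N") -> str:
    """Replace every lower-case (soft-masked) base with mask_char."""
    return "".join(mask_char if base.islower() else base for base in sequence)


def hard_mask_fasta(lines):
    """Yield FASTA lines with sequence lines hard-masked.

    Header lines (starting with '>') are passed through unchanged.
    """
    for line in lines:
        yield line if line.startswith(">") else hard_mask(line)


# Hypothetical in-memory FASTA record, used only for illustration.
records = [">chr1 example", "ACGTacgtACGT", "nnnnACGT"]
masked = list(hard_mask_fasta(records))
```

Upper-case bases and header lines are left untouched, so the contig names and sequence lengths in the output match the input.
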
Tools for manipulating FASTQ files.
| Tool | Description |
|---|---|
| DemuxFastqs | Performs sample demultiplexing on FASTQs |
| FastqToBam | Generates an unmapped BAM (or SAM or CRAM) file from fastq files |
| SortFastq | Sorts a FASTQ file |
| TrimFastq | Trims reads in one or more line-matched fastq files to a specific read length |
Tools for RNA-Seq data.
| Tool | Description |
|---|---|
| CollectErccMetrics | Collects metrics for ERCC spike-ins for RNA-Seq experiments |
| EstimateRnaSeqInsertSize | Computes the insert size for RNA-Seq experiments |
Tools for manipulating SAM, BAM, or related data.
| Tool | Description |
|---|---|
| AnnotateBamWithUmis | Annotates existing BAM files with UMIs (Unique Molecular Indices, aka Molecular IDs, Molecular barcodes) from separate FASTQ files |
| AssignPrimers | Assigns reads to primers post-alignment |
| AutoGenerateReadGroupsByName | Adds read groups to a BAM file for a single sample by parsing the read names |
| CallOverlappingConsensusBases | Consensus calls overlapping bases in read pairs |
| ClipBam | Clips reads from the same template |
| DownsampleAndNormalizeBam | Downsamples a BAM in a biased way to a uniform coverage across regions |
| ErrorRateByReadPosition | Calculates the error rate by read position on coordinate sorted mapped BAMs |
| EstimatePoolingFractions | Examines sequence data generated from a pooled sample and estimates the fraction of sequence data coming from each constituent sample |
| ExtractUmisFromBam | Extracts unique molecular indexes from reads in a BAM file into tags |
| FilterBam | Filters reads out of a BAM file |
| FindSwitchbackReads | Finds reads where a template switch occurred during library construction |
| FindTechnicalReads | Finds reads that are from technical or synthetic sequences in a BAM file |
| RandomizeBam | Randomizes the order of reads in a SAM or BAM file |
| RemoveSamTags | Removes SAM tags from a SAM or BAM file |
| SetMateInformation | Adds and/or fixes mate information on paired-end reads |
| SortBam | Sorts a SAM or BAM file |
| SplitBam | Splits a BAM into multiple BAMs, one per read group (or library) |
| TrimPrimers | Trims primers from reads post-alignment |
| UpdateReadGroups | Updates one or more read groups and their identifiers |
| ZipperBams | Zips together an unmapped and mapped BAM to transfer metadata into the output BAM |
Tools for manipulating UMIs & reads tagged with UMIs.
| Tool | Description |
|---|---|
| CallCodecConsensusReads | Calls consensus sequences from reads generated from the CODEC protocol |
| CallDuplexConsensusReads | Calls duplex consensus sequences from reads generated from the same double-stranded source molecule |
| CallMolecularConsensusReads | Calls consensus sequences from reads with the same unique molecular tag |
| CollectDuplexSeqMetrics | Collects a suite of metrics to QC duplex sequencing data |
| CopyUmiFromReadName | Copies the UMI at the end of the BAM’s read name to the RX tag |
| CorrectUmis | Corrects UMIs stored in BAM files when a set of fixed UMIs is in use |
| FilterConsensusReads | Filters consensus reads generated by CallMolecularConsensusReads or CallDuplexConsensusReads |
| GroupReadsByUmi | Groups reads together that appear to have come from the same original molecule |
| ReviewConsensusVariants | Extracts data to make reviewing of variant calls from consensus reads easier |
Various utility programs.
| Tool | Description |
|---|---|
| PickIlluminaIndices | Picks a set of molecular indices that should work well together |
| PickLongIndices | Picks a set of molecular indices that have at least a given number of mismatches between them |
| UpdateDelimitedFileContigNames | Updates the contig names in columns of a delimited data file |
| UpdateGffContigNames | Updates the contig names in a GFF |
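
The "at least a given number of mismatches between them" criterion in the PickLongIndices description can be sketched as a pairwise mismatch (Hamming distance) check. The greedy selection below is a deliberately simple stand-in chosen for illustration, not the algorithm fgbio uses. The names `mismatches`, `min_pairwise_mismatches`, and `pick_greedy` are hypothetical.

```python
from itertools import combinations


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length indices differ."""
    if len(a) != len(b):
        raise ValueError("indices must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_mismatches(indices) -> int:
    """Smallest mismatch count over all pairs in the set."""
    return min(mismatches(a, b) for a, b in combinations(indices, 2))


def pick_greedy(candidates, min_mismatches: int):
    """Keep each candidate that is far enough from every index kept so far."""
    picked = []
    for candidate in candidates:
        if all(mismatches(candidate, p) >= min_mismatches for p in picked):
            picked.append(candidate)
    return picked


# Hypothetical candidate indices, used only for illustration.
candidates = ["AAAA", "AAAT", "TTTT", "TTAA", "CCCC"]
picked = pick_greedy(candidates, min_mismatches=3)
```

A greedy pass depends on candidate order and does not guarantee the largest possible set; it only guarantees that every pair in the result meets the mismatch threshold.
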
Tools for manipulating VCF, BCF, or related data.
| Tool | Description |
|---|---|
| AssessPhasing | Assesses the accuracy of phasing for a set of variants |
| DownsampleVcf | Re-genotypes a VCF after downsampling the allele counts |
| FilterSomaticVcf | Applies one or more filters to a VCF of somatic variants |
| FixVcfPhaseSet | Adds/fixes the phase set (PS) genotype field |
| HapCutToVcf | Converts the output of ‘HAPCUT’ (‘HapCut1’/‘HapCut2’) to a VCF |
| MakeMixtureVcf | Creates a VCF with one sample whose genotypes are a mixture of other samples’ |
| MakeTwoSampleMixtureVcf | Creates a simulated tumor or tumor/normal VCF by in-silico mixing genotypes from two samples |
| UpdateVcfContigNames | Updates the contig names in a VCF |