-b Output in the BAM format.
|
-f INT Only output alignments with all bits in INT present in the FLAG field. INT can be in
hex in the format of /^0x[0-9A-F]+/ [0]
|
-F INT Skip alignments with bits present in INT [0]
|
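The -f and -F filters are bitwise tests on the FLAG field. A minimal sketch of the logic described above, assuming illustrative function names (`parse_flag_mask`, `passes_flag_filters`) that are not part of samtools:

```python
def parse_flag_mask(text):
    """Parse INT given as decimal ("16") or 0x-prefixed hex ("0x10")."""
    # Base 0 lets Python pick the base from the prefix.
    return int(text, 0)


def passes_flag_filters(flag, require=0, exclude=0):
    """Return True if FLAG survives -f `require` and -F `exclude`."""
    has_all_required = (flag & require) == require  # -f: every bit of INT is set
    has_no_excluded = (flag & exclude) == 0         # -F: no bit of INT is set
    return has_all_required and has_no_excluded
```

With both masks left at their default of 0, every alignment passes, which matches the [0] defaults listed for the two options.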
-h Include the header in the output.
|
-H Output the header only.
|
-l STR Only output reads in library STR [null]
|
-o FILE Output file [stdout]
|
-q INT Skip alignments with MAPQ smaller than INT [0]
|
-r STR Only output reads in read group STR [null]
-R FILE Output reads in read groups listed in FILE [null]
|
-S Input is in SAM. If @SQ header lines are absent, the `-t' option is required.
|
-c Instead of printing the alignments, only count them and print the total number. All
filter options, such as `-f', `-F' and `-q' , are taken into account.
-t FILE This file is TAB-delimited. Each line must contain the reference name and the length of
the reference, one line for each distinct reference; additional fields are ignored.
This file also defines the order of the reference sequences in sorting. If you run
`samtools faidx <ref.fa>', the resultant index file <ref.fa>.fai can be used as this
<in.ref_list> file.
|
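The layout expected by -t can be illustrated with a small reader. This is a sketch only: `parse_ref_list` is a hypothetical helper, and skipping blank lines is an assumption of the sketch rather than documented behaviour.

```python
def parse_ref_list(lines):
    """Return [(name, length), ...] in file order from TAB-delimited lines."""
    refs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # assumption: blank lines are skipped in this sketch
        fields = line.split("\t")
        # Only the first two fields matter; additional fields are ignored.
        refs.append((fields[0], int(fields[1])))
    return refs
```

Because the list preserves file order, it also captures the reference ordering that the file defines. A five-column .fai line parses the same way as a two-column line, since only the name and length are read.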
-u Output uncompressed BAM. This option saves time spent on compression/decompression and
is thus preferred when the output is piped to another samtools command.
|
-6 Assume the quality is in the Illumina 1.3+ encoding.
|
-A Do not skip anomalous read pairs in variant calling.
|
-B Disable probabilistic realignment for the computation of base alignment quality
(BAQ). BAQ is the Phred-scaled probability of a read base being misaligned. Applying
BAQ greatly helps to reduce false SNPs caused by misalignments, so -B turns that
protection off.
|
-b FILE List of input BAM files, one file per line [null]
|
-C INT Coefficient for downgrading mapping quality for reads containing excessive
mismatches. Given a read with a phred-scaled probability q of being generated from
the mapped position, the new mapping quality is about sqrt((INT-q)/INT)*INT. A zero
value disables this functionality; if enabled, the recommended value for BWA is 50.
[0]
|
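The -C formula can be evaluated directly. The sketch below only reproduces the expression quoted above, which the text itself calls approximate; `downgraded_mapq` is a hypothetical name, and rejecting q outside the range 0..INT is a choice made for this sketch, not documented samtools behaviour.

```python
import math


def downgraded_mapq(q, coef):
    """Evaluate sqrt((INT - q) / INT) * INT for -C INT; None when disabled."""
    if coef == 0:
        return None  # a zero coefficient disables the adjustment
    if not 0 <= q <= coef:
        # Assumption of this sketch: stay inside the domain of the square root.
        raise ValueError("q must lie between 0 and the coefficient")
    return math.sqrt((coef - q) / coef) * coef
```

With the value of 50 recommended for BWA, q = 0 gives 50 and q = 50 gives 0, so the result falls as q approaches the coefficient.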
-d INT At each position, read at most INT reads per input BAM. [250]
|
-E Extended BAQ computation. This option helps sensitivity especially for MNPs, but may
hurt specificity a little bit.
|
-f FILE The faidx-indexed reference file in the FASTA format. The file can be optionally
compressed by razip. [null]
|
-l FILE BED or position list file containing a list of regions or sites where pileup or BCF
should be generated [null]
|
-q INT Minimum mapping quality for an alignment to be used [0]
|
-Q INT Minimum base quality for a base to be considered [13]
|
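The -q and -Q thresholds are quality scores. Assuming the standard Phred definition Q = -10*log10(P), the sketch below converts between a score and an error probability; the function names are illustrative only.

```python
import math


def phred_to_error_prob(q):
    """Error probability for a Phred-scaled quality q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred-scaled quality for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)
```

Under this definition the default base quality cutoff of 13 corresponds to an error probability of roughly 0.05.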
-r STR Only generate pileup in region STR [all sites]
|
-D Output per-sample read depth
|
-g Compute genotype likelihoods and output them in the binary call format (BCF).
|
-S Output per-sample Phred-scaled strand bias P-value
|
-u Similar to -g except that the output is uncompressed BCF, which is preferred for
piping.
|
-e INT Phred-scaled gap extension sequencing error probability. Reducing INT leads to longer
indels. [20]
|
-h INT Coefficient for modeling homopolymer errors. Given an l-long homopolymer run, the
sequencing error of an indel of size s is modeled as INT*s/l. [100]
|
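The homopolymer model for -h is a single expression. The sketch below evaluates INT*s/l exactly as written above; `homopolymer_indel_error` is a hypothetical name, and no interpretation of the resulting number beyond the quoted formula is implied.

```python
def homopolymer_indel_error(coef, indel_size, run_length):
    """Evaluate INT * s / l for an indel of size s in an l-long homopolymer run."""
    if run_length <= 0:
        raise ValueError("run_length must be positive")
    return coef * indel_size / run_length
```

With the default coefficient of 100, a 1-base indel in a 5-base run and a 2-base indel in a 10-base run both evaluate to 20.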
-I Do not perform INDEL calling
|
-L INT Skip INDEL calling if the average per-sample depth is above INT. [250]
|
-o INT Phred-scaled gap open sequencing error probability. Reducing INT leads to more indel
calls. [40]
|
-P STR Comma-delimited list of platforms (determined by @RG-PL) from which indel candidates
are obtained. It is recommended to collect indel candidates from sequencing
technologies that have a low indel error rate, such as ILLUMINA. [all]
|
-o Output the final alignment to the standard output.
|
-n Sort by read names rather than by chromosomal coordinates
|
-m INT Approximately the maximum required memory. [500000000]
merge samtools merge [-nur1f] [-h inh.sam] [-R reg] <out.bam> <in1.bam> <in2.bam> [...]
Merge multiple sorted alignments. The header reference lists of all the input BAM files, and
the @SQ headers of inh.sam, if any, must all refer to the same set of reference sequences. The
header reference list and (unless overridden by -h) `@' headers of in1.bam will be copied to
out.bam, and the headers of other files will be ignored.
|
-1 Use zlib compression level 1 to compress the output
|
-f Overwrite the output file if it already exists.
-h FILE Use the lines of FILE as `@' headers to be copied to out.bam, replacing any header
lines that would otherwise be copied from in1.bam. (FILE is actually in SAM format,
though any alignment records it may contain are ignored.)
|
-n The input alignments are sorted by read names rather than by chromosomal coordinates
|
-R STR Merge files in the specified region indicated by STR [null]
|
-r Attach an RG tag to each alignment. The tag value is inferred from file names.
|
-u Uncompressed BAM output
|
-s Remove duplicates for single-end reads. By default, the command works for paired-end
reads only.
|
-S Treat paired-end reads as single-end reads.
calmd samtools calmd [-EeubSr] [-C capQcoef] <aln.bam> <ref.fasta>
Generate the MD tag. If the MD tag is already present, this command will give a warning if the
MD tag generated is different from the existing tag. Output SAM by default.
|
-A When used jointly with -r this option overwrites the original base quality.
|
-e Convert a read base to = if it is identical to the aligned reference base. The indel
caller does not support = bases at the moment.
|
-u Output uncompressed BAM
|
-b Output compressed BAM
|
-S The input is SAM with header lines
|
-C INT Coefficient to cap mapping quality of poorly mapped reads. See the pileup command for
details. [0]
|
-r Compute the BQ tag (without -A) or cap base quality by BAQ (with -A).
|
-E Extended BAQ calculation. This option trades specificity for sensitivity, though the
effect is minor.
targetcut samtools targetcut [-Q minBaseQ] [-i inPenalty] [-0 em0] [-1 em1] [-2 em2] [-f ref] <in.bam>
This command identifies target regions by examining the continuity of read depth, computes
haploid consensus sequences of targets and outputs a SAM with each sequence corresponding to a
target. When option -f is in use, BAQ will be applied. This command is only designed for
cutting fosmid clones from fosmid pool sequencing [Ref. Kitzman et al. (2010)].
phase samtools phase [-AF] [-k len] [-b prefix] [-q minLOD] [-Q minBaseQ] <in.bam>
Call and phase heterozygous SNPs. OPTIONS:
|
-A Drop reads with ambiguous phase.
|
-b STR Prefix of BAM output. When this option is in use, phase-0 reads will be saved in file
STR.0.bam and phase-1 reads in STR.1.bam. Phase unknown reads will be randomly
allocated to one of the two files. Chimeric reads with switch errors will be saved in
STR.chimeric.bam. [null]
|
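The output naming used by -b can be written out explicitly. A small sketch, assuming a hypothetical helper `phase_output_paths` that only builds the three file names listed above:

```python
def phase_output_paths(prefix):
    """Map each output category of `samtools phase -b STR` to its file name."""
    return {
        "phase0": f"{prefix}.0.bam",          # phase-0 reads
        "phase1": f"{prefix}.1.bam",          # phase-1 reads
        "chimeric": f"{prefix}.chimeric.bam",  # chimeric reads with switch errors
    }
```

Phase-unknown reads have no file of their own; as stated above, they are randomly allocated to one of the first two files.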
-F Do not attempt to fix chimeric reads.
|
-k INT Maximum length for local phasing. [13]
|
-q INT Minimum Phred-scaled LOD to call a heterozygote. [40]
|
-Q INT Minimum base quality to be used in het calling. [13]
|
-A Retain all possible alternate alleles at variant sites. By default, the view command
discards unlikely alleles.
|
-b Output in the BCF format. The default is VCF.
|
-D FILE Sequence dictionary (list of chromosome names) for VCF->BCF conversion [null]
|
-F Indicate PL is generated by r921 or before (ordering is different).
|
-G Suppress all individual genotype information.
|
-l FILE List of sites at which information is output [all sites]
|
-N Skip sites where the REF field is not A/C/G/T
|
-Q Output the QCALL likelihood format
|
-s FILE List of samples to use. The first column in the input gives the sample names and the
second gives the ploidy, which can only be 1 or 2. When the 2nd column is absent, the
sample ploidy is assumed to be 2. In the output, the ordering of samples will be
identical to the one in FILE. [null]
|
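The sample file read by -s can be illustrated with a short reader. This is a sketch under stated assumptions: `parse_sample_list` is a hypothetical helper, and splitting columns on whitespace is an assumption, since the text above does not name the delimiter.

```python
def parse_sample_list(lines):
    """Return [(sample, ploidy), ...] in file order; ploidy defaults to 2."""
    samples = []
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue  # assumption: blank lines are skipped in this sketch
        ploidy = int(fields[1]) if len(fields) > 1 else 2
        if ploidy not in (1, 2):
            raise ValueError(f"ploidy must be 1 or 2, got {ploidy}")
        samples.append((fields[0], ploidy))
    return samples
```

The returned order matters because the output lists samples in the same order as FILE. For trio calling with -T, that order would be child, father, mother.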
-S The input is VCF instead of BCF.
|
-u Uncompressed BCF output (force -b).
|
-c Call variants using Bayesian inference. This option automatically invokes option -e.
|
-d FLOAT When -v is in use, skip loci where the fraction of samples covered by reads is below
FLOAT. [0]
|
-e Perform max-likelihood inference only, including estimating the site allele
frequency, testing Hardy-Weinberg equilibrium and testing associations with LRT.
|
-g Call per-sample genotypes at variant sites (force -c)
|
-i FLOAT Ratio of INDEL-to-SNP mutation rate [0.15]
|
-p FLOAT A site is considered to be a variant if P(ref|D)<FLOAT [0.5]
|
-P STR Prior or initial allele frequency spectrum. STR can be full, cond2, flat or a
file consisting of the error output from a previous variant calling run.
|
-t FLOAT Scaled mutation rate for variant calling [0.001]
|
-T STR Enable pair/trio calling. For trio calling, option -s is usually needed to
configure the trio members and their ordering. In the file supplied to the option
-s, the first sample must be the child, the second the father and the third the
mother. The valid values of STR are `pair', `trioauto', `trioxd' and `trioxs', where
`pair' calls differences between two input samples, and `trioxd' (`trioxs') specifies
that the input is from the X chromosome non-PAR regions and the child is a female
(male). [null]
|
-v Output variant sites only (force -c)
|
-1 INT Number of group-1 samples. This option is used for dividing the samples into two
groups for contrast SNP calling or association test. When this option is in use, the
following VCF INFO fields will be output: PC2, PCHI2 and QCHI2. [0]
|
-U INT Number of permutations for association test (effective only with -1) [0]
|
-X FLOAT Only perform permutations for P(chi^2)<FLOAT (effective only with -U) [0.01]
|