faops
is a lightweight tool for operating sequences in the fasta format.
This tool can be regarded as a combination of faCount
, faSize
,
faFrag
, faRc
, faSomeRecords
, faFilter
and faSplit
from
UCSC Jim Kent's utilities.
Comparing to Kent's fa*
utilities, faops
is:
- much smaller (kilo vs mega bytes)
- easy to compile (only one external dependency)
- well tested
- contains only one executable file
- can operate gzipped (bgzipped) files
- and can be run under all major OSes (including Windows).
faops
is also inspired/influenced/stealing from
seqtk
and
ufasta
.
$ ./faops
Usage: faops <command> [options] <arguments>
Version: 0.8.21
Commands:
help print this message
count count base statistics in FA file(s)
size count total bases in FA file(s)
masked masked (or gaps) regions in FA file(s)
frag extract sub-sequences from a FA file
rc reverse complement a FA file
one extract one fa record
some extract some fa records
order extract some fa records by the given order
replace replace headers from a FA file
filter filter fa records
split-name splitting by sequence names
split-about splitting to chunks about specified size
n50 compute N50 and other statistics
dazz rename records for dazz_db
interleave interleave two PE files
region extract regions from a FA file
Options:
There're no global options.
Type "faops command-name" for detailed options of each command.
Options *MUST* be placed just after command.
-
Reverse complement
faops rc test/ufasta.fa out.fa # prepend RC_ to names faops rc -n test/ufasta.fa out.fa # keep original names
-
Extract sequences with names in
list.file
, one name per linefaops some test/ufasta.fa list.file out.fa
-
Same as above, but from stdin and to stdout
cat test/ufasta.fa | faops some stdin list.file stdout
-
Sort by header strings
faops order test/ufasta.fa \ <(cat test/ufasta.fa | grep '>' | sed 's/>//' | sort) \ out.fa
-
Sort by lengths
faops order test/ufasta.fa \ <(faops size test/ufasta.fa | sort -n -r -k2,2 | cut -f 1) \ out2.fa
-
Tidy fasta file to 80 characters of sequence per line
faops filter -l 80 test/ufasta.fa out.fa
-
All content written on one line
faops filter -l 0 test/ufasta.fa out.fa
-
Convert fastq to fasta
faops filter -l 0 in.fq out.fa
-
Compute N50, clean result
faops n50 -H test/ufasta.fa
-
Compute N75
faops n50 -N 75 test/ufasta.fa
-
Compute N90, sum and average of contigs with estimated genome size
faops n50 -N 90 -S -A -g 10000 test/ufasta.fa
faops
can be compiled under Linux, macOS (gcc or clang) and Windows
(MinGW).
git clone https://github.com/wang-q/faops
cd faops
make
brew install wang-q/tap/faops
Done with bats. Useful articles:
- https://blog.engineyard.com/2014/bats-test-command-line-tools
- http://blog.spike.cx/post/60548255435/testing-bash-scripts-with-bats
# brew install bats-core
make test
zlib
kseq.h
andkhash.h
fromklib
(bundled)
Qiang Wang <[email protected]>
This software is copyright (c) 2014 by Qiang Wang.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.