SWPS3 project page

Authors: Adam M Szalkowski, Christian Ledergerber, Philipp Krähenbühl, Christophe Dessimoz
Publication: reference here

Introduction:

swps3 is a vectorized implementation of the Smith-Waterman local alignment algorithm, optimized for both the Cell/B.E. and x86 architectures. Performance is reported in GCUPS (giga cell updates per second, i.e. billions of dynamic-programming matrix cells computed per second). Our benchmarking results show that swps3 is currently the fastest vectorized Smith-Waterman implementation on the Cell/B.E.: on a PlayStation 3, it achieves up to 8.0 GCUPS. On x86 with the SSE2 instruction set, swps3 improves over state-of-the-art implementations by natively exploiting multi-core architectures: on a quad-core Intel Pentium, it reaches up to 15.7 GCUPS.
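As a rough illustration of how a GCUPS figure can be derived, the sketch below divides the number of alignment matrix cells (query length times total database residues) by the wall-clock time. The function name and the numbers in the usage line are hypothetical and are not measurements taken with swps3.

```python
def gcups(query_length: int, database_residues: int, seconds: float) -> float:
    """Return billions of dynamic-programming cell updates per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # One cell is computed for every (query residue, database residue) pair.
    cells = query_length * database_residues
    return cells / seconds / 1e9


# Hypothetical run: a 1,000-residue query against a database holding
# 100 million residues in total, finishing in 12.5 seconds.
print(gcups(1_000, 100_000_000, 12.5))
```

The same formula applies to a multi-query run if the cell counts of all query/database pairs are summed before dividing by the total time.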

Downloads:

If you experience any problems downloading or running swps3, please get in touch.

Example:

Downloading and building swps3 from source on Linux:

# download and extract source
wget http://lab.dessimoz.org/swps3/files/swps3-src-current.tar.bz2
tar xjvf swps3-src-current.tar.bz2

# delete downloaded archive and change into the source directory
rm swps3-src-current.tar.bz2
cd swps3-*

Setting up swps3:

# download and extract scoring matrices
wget http://lab.dessimoz.org/swps3/files/matrices.tar.bz2
tar xjvf matrices.tar.bz2

# download and extract test sequences
wget http://lab.dessimoz.org/swps3/files/test.tar.bz2
tar xjvf test.tar.bz2

Testing swps3 on x86/x86_64:

# perform an alignment:
./swps3 matrices/blosum50.mat test/query1000.fa test/db1000.fa

# to verify the results, run the (slow) scalar version with the -s switch:
./swps3 -s matrices/blosum50.mat test/query1000.fa test/db1000.fa

# if the CPU has multiple cores (e.g. quad-core) and the automatic
# detection fails, set the number of worker threads explicitly
# (-j 4: start four worker threads):
./swps3 -j 4 matrices/blosum50.mat test/query1000.fa test/db1000.fa
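When swps3 is driven from a script, the worker count can be determined up front and passed with -j. The following is a minimal sketch, assuming a Unix-like system: it queries sysconf for the number of online processors and assembles an argument list using only the switches shown above. The helper names are hypothetical, and the choice of SC_NPROCESSORS_ONLN is an assumption; this page only states that swps3 itself uses sysconf.

```python
import os


def detect_worker_threads() -> int:
    """Return the number of online processors, falling back to 1."""
    try:
        # Assumption: SC_NPROCESSORS_ONLN is the relevant sysconf key.
        count = os.sysconf("SC_NPROCESSORS_ONLN")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or this key is unavailable on the platform.
        count = -1
    return count if count > 0 else 1


def swps3_command(matrix, query, db, threads=None, scalar=False):
    """Build an argument list for swps3 (hypothetical helper)."""
    cmd = ["./swps3"]
    if scalar:
        cmd.append("-s")
    if threads is not None:
        cmd += ["-j", str(threads)]
    cmd += [matrix, query, db]
    return cmd


print(swps3_command("matrices/blosum50.mat", "test/query1000.fa",
                    "test/db1000.fa", threads=detect_worker_threads()))
```

The resulting list can be handed to subprocess.run once the swps3 binary is present in the working directory.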

Testing swps3 on Cell/B.E.:

# perform an alignment using six SPUs (default setting):
./swps3 matrices/blosum50.mat test/query1000.fa test/db1000.fa

# additionally use the AltiVec instruction set of the PPU (six SPUs plus the PPU):
./swps3 -j 7 matrices/blosum50.mat test/query1000.fa test/db1000.fa

# if you have a Cell processor with eight SPUs available:
./swps3 -j 9 matrices/blosum50.mat test/query1000.fa test/db1000.fa

Documentation:

Command line parameters:

Usage: swps3 [-h] [-s] [-j num] [-i num] [-e num] [-t num] matrix query db
matrix: scoring matrix file (see sample matrices)
query: query amino acid sequence in FASTA format
db: amino acid sequence database in FASTA format
-h: print help
-s: run scalar version (without vectorized instructions)
-j num: start num worker threads (if not specified, the number of processors is detected via sysconf)
-i num: gap insertion score (default: 12)
-e num: gap extension score (default: 2)
-t num: score limit (default: DBL_MAX)
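To illustrate how a scoring function and the two gap parameters enter the computation, here is a minimal scalar sketch of Smith-Waterman with affine gaps (Gotoh recurrences). It is not the swps3 implementation. It assumes that the first residue of a gap costs the insertion score and each further residue costs the extension score; whether swps3 uses exactly this convention is not stated on this page. The function names and the toy scoring function (+5 match, -4 mismatch) are hypothetical stand-ins for a real matrix such as BLOSUM50.

```python
def smith_waterman_score(query, target, score, gap_open=12, gap_extend=2):
    """Return the best local alignment score under affine gap costs."""
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0] * (n + 1)      # H values of the previous row
    f = [neg_inf] * (n + 1)     # best score ending in a vertical gap, per column
    best = 0
    for a in query:
        h_curr = [0] * (n + 1)
        e = neg_inf             # best score ending in a horizontal gap
        for j in range(1, n + 1):
            # Open a new gap from H, or extend the gap that is already open.
            e = max(h_curr[j - 1] - gap_open, e - gap_extend)
            f[j] = max(h_prev[j] - gap_open, f[j] - gap_extend)
            # Local alignment: the score never drops below zero.
            h = max(0, h_prev[j - 1] + score(a, target[j - 1]), e, f[j])
            h_curr[j] = h
            best = max(best, h)
        h_prev = h_curr
    return best


def simple_score(a, b):
    """Toy substitution score (assumption, not a real matrix)."""
    return 5 if a == b else -4


print(smith_waterman_score("ACDEFGHIKL", "ACDEFWGHIKL", simple_score))
```

The sketch keeps only one row of the matrix in memory, which is sufficient when only the score, and not the alignment itself, is needed.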

API documentation of libswps3:

Alongside swps3, a library version named libswps3 is provided for application developers. Its documentation is available online.