SOLiD RNA-Seq & splice-aware mapping

I’ve lost quite a lot of time trying to align color-space RNA-Seq reads. The SHRiMP paper explains nicely why it’s important to align SOLiD reads in color-space instead of converting them directly into sequence-space. Below is the simplest solution I have found: tophat, which relies on the bowtie mapper (bowtie2 doesn’t support color-space), run on color-space reads in .csfasta format.
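To get a feel for why naive conversion is risky, here is a toy sketch (not part of the pipeline) that decodes a color-space read with the standard SOLiD two-base encoding. The function name decode_cs and the example reads are made up for illustration. Each color encodes a transition from the previous base, so a single miscalled color changes every base after it.

```bash
#!/usr/bin/env bash
# Toy decoder (hypothetical helper): primer base + colors 0-3 -> bases.
# Each color is the transition between two adjacent bases:
#   0: same base, 1: A<->C / G<->T, 2: A<->G / C<->T, 3: A<->T / C<->G
decode_cs() {
    local read=$1
    local base=${read:0:1}   # known primer base, not part of the output
    local out="" i color
    for (( i = 1; i < ${#read}; i++ )); do
        color=${read:i:1}
        case "$base$color" in
            A0|C1|G2|T3) base=A ;;
            A1|C0|G3|T2) base=C ;;
            A2|C3|G0|T1) base=G ;;
            A3|C2|G1|T0) base=T ;;
            *) echo "unexpected symbol in $read" >&2; return 1 ;;
        esac
        out+=$base
    done
    echo "$out"
}

decode_cs T012310   # prints TGATGG
decode_cs T022310   # one color changed (1 -> 2): prints TCTACC
```

The two reads differ in a single color, yet every decoded base from that position onward is different. An aligner working in color-space sees only one mismatch instead.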

# generate genome index in color-space
bowtie-build --color GENOME.fa GENOME

# get SOLiD reads from SRA if you don’t have them already in .csfasta
abi-dump SRR062662
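
The mapping step expects every NAME.csfasta to have a matching NAME_QV.qual file in the same directory. A minimal pre-flight check could look like the sketch below; missing_quals is a hypothetical helper name, and READS_DIR is the same placeholder used in the rest of this post.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every .csfasta file in a directory
# that has no matching <name>_QV.qual file next to it.
missing_quals() {
    local dir=$1 f s
    for f in "$dir"/*.csfasta; do
        [ -e "$f" ] || continue              # glob matched nothing
        s=$(basename "$f" .csfasta)
        [ -f "$dir/${s}_QV.qual" ] || echo "$f"
    done
}

missing_quals READS_DIR
```

If this prints nothing, every read file has its quality file and the mapping loop should find all of its inputs.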

# tophat splice-aware mapping in color-space
ref=GENOME  # basename of the color-space index built above
mkdir -p tophat
for f in READS_DIR/*.csfasta; do
    s=$(basename "$f" .csfasta)
    # skip samples that already have an output directory
    if [ ! -d "tophat/$s" ]; then
        echo "$(date) $f $s"
        tophat -p 4 --no-coverage-search --color -o "tophat/$s" --quals "$ref" "$f" "READS_DIR/${s}_QV.qual"
    fi
done


