SOLiD RNA-Seq & splice-aware mapping

I’ve lost quite a lot of time trying to align color-space RNA-Seq reads. The SHRiMP paper explains nicely why it’s important to align SOLiD reads in color-space instead of converting them directly into sequence-space. The simplest solution I have found uses tophat, which relies on the bowtie mapper (bowtie2 doesn’t support color-space), with color-space reads in .csfasta format.
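
As a quick aside on why direct conversion is risky, here is a minimal sketch of a naive color-space decoder. It is my own illustration, not code from SHRiMP or any other tool. The function name `decode_colorspace` and the two example reads are made up, and it assumes reads contain only a primer base followed by colors 0-3 (no `.` for missing calls). Each color encodes a transition between two neighbouring bases, so one miscalled color changes every base decoded after it.

```bash
#!/usr/bin/env bash
# Decode a color-space read (primer base + colors 0-3) into bases.
# With A=0, C=1, G=2, T=3, the next base index is (previous index XOR color).
# Hypothetical helper for illustration; assumes no '.' (missing) colors.
decode_colorspace() {
    local read=$1 bases=ACGT
    local prev=${read:0:1} out="" i color idx
    for (( i = 1; i < ${#read}; i++ )); do
        color=${read:i:1}
        case $prev in A) idx=0 ;; C) idx=1 ;; G) idx=2 ;; T) idx=3 ;; esac
        idx=$(( idx ^ color ))
        prev=${bases:idx:1}
        out+=$prev
    done
    echo "$out"
}

decode_colorspace T0120312   # prints TGAATGA
decode_colorspace T0130312   # third color miscalled: prints TGCCGTC
```

The two reads differ in a single color, yet the decoded sequences disagree at every base from that position on. An aligner working in color-space sees one mismatch instead.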

[bash]
# generate genome index in color-space
bowtie-build --color GENOME.fa GENOME

# get SOLiD reads from SRA if you don't have them already in .csfasta
abi-dump SRR062662

# tophat splice-aware mapping in color-space
mkdir -p tophat
ref=REFERENCE_DIR/GENOME
for f in READS_DIR/*.csfasta; do
    # sample name = file name up to the first dot
    s=$(basename "$f" | cut -f1 -d'.')
    # skip samples that already have an output directory
    if [ ! -d "tophat/$s" ]; then
        echo "$(date) $f $s"
        tophat -p 4 --no-coverage-search --color -o "tophat/$s" --quals "$ref" "$f" "READS_DIR/${s}_QV.qual"
    fi
done
[/bash]
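
The loop above expects every `.csfasta` file to have a matching `<sample>_QV.qual` file in the same directory. A small pre-flight check can catch a missing quality file before tophat starts. This is only a sketch: `check_quals` is a hypothetical helper name, and it assumes the same naming scheme as the loop.

```bash
#!/usr/bin/env bash
# Report .csfasta files that lack the matching <sample>_QV.qual file.
# Prints one line per missing file; returns 1 if any is missing, else 0.
# Hypothetical helper; assumes the <sample>_QV.qual naming used in the loop.
check_quals() {
    local dir=$1 f s missing=0
    for f in "$dir"/*.csfasta; do
        [ -e "$f" ] || continue    # glob matched nothing
        s=$(basename "$f" | cut -f1 -d'.')
        if [ ! -f "$dir/${s}_QV.qual" ]; then
            echo "missing: $dir/${s}_QV.qual"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_quals READS_DIR || exit 1` before the mapping loop would stop the script early if any quality file is absent.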
