Detecting isoform switching from RNA-Seq

Alternative splicing produces an array of transcripts from individual gene. Some of these alternative transcripts are expressed in tissue-specific fashion and may play different roles in the cell.

Recently, I got interested in identifying isoform switching from RNA-Seq data, this is detecting changes in the most expressed isoform between conditions. As I couldn’t find any ready solution, I have written my own script for this task (fpkm2major.py). This script simply parse isoforms.fpkm_tracking output from cufflinks and report genes with evidence of isoform switching into tab-delimited output file.

For each gene, number of transcripts, cumulative expression, major isoforms for each condition and each major isoform are reported. Lowly expressed genes in given condition are coded with `0` (--minFPKM < 1), while genes without clear major isoform are marked with `-1` (--frac 0.25).

[bash]
fpkm2major.py -v -i sample1/isoforms.fpkm_tracking -o major_isoforms.txt

# combine data from multiple samples
paste sample1/isoforms.fpkm_tracking <(cut -f10- sample2/isoforms.fpkm_tracking)
| fpkm2major.py -v -o major_isoforms2.txt

# visualise using cummeRbund
R
library(cummeRbund)
cuff <- readCufflinks("sample1")

# expression plot for isoforms of given gene
geneid="ENSDARG00000013615"
gene<-getGene(cuff, geneid)
expressionPlot(isoforms(gene), showErrorbars=T)
dev.copy(svg, paste(geneid,".sample1.svg"), width=12, height=12); dev.off()

[/bash]

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s