A Novel, Improved Application for the Normalization of RNA-seq Expression Data in Complex Polyploids
Abstract

Much of the work on the normalization of RNA-seq data has been performed on human samples, notably cancer tissue. Little work has been done in plants, particularly polyploids and species whose genomes are incomplete or unavailable. We present GCGeTMM, a novel implementation of GeTMM (Gene Length Corrected TMM) that accounts for GC bias and works at the transcript level. The algorithm also employs transcript length as a factor, allowing for incomplete transcripts and alternate transcripts. Together, these features significantly improve overall normalization. The GCGeTMM methodology also allows for the simultaneous determination of differentially expressed transcripts (and, by extension, genes) and of stably expressed genes that can act as references for qRT-PCR and microarray analyses.
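To make the length-correction idea concrete, the following is a minimal sketch of a GeTMM-style calculation at the transcript level: counts are first converted to reads per kilobase (RPK), a TMM-like scaling factor is then estimated on the RPK values, and the result is scaled per million. It is not the GCGeTMM implementation. The function names are hypothetical, the trimmed mean is a simplified, unweighted stand-in for the full TMM procedure (which also trims on average abundance and applies precision weights), and the GC-bias correction is omitted because its form is not specified here.

```python
import math


def reads_per_kilobase(counts, lengths_bp):
    """Divide each transcript's read count by its length in kilobases.

    Assumes every length is greater than zero.
    """
    return [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]


def simple_tmm_factor(sample, reference, trim=0.3):
    """Simplified, unweighted TMM-like factor (an assumption, not full TMM).

    Takes log2 ratios of within-sample proportions, drops the lowest and
    highest `trim` fraction, and returns 2 ** (mean of the remainder).
    """
    sample_total = sum(sample)
    reference_total = sum(reference)
    log_ratios = sorted(
        math.log2((s / sample_total) / (r / reference_total))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0  # transcripts absent in either sample are skipped
    )
    if not log_ratios:
        return 1.0
    k = int(len(log_ratios) * trim)
    kept = log_ratios[k:len(log_ratios) - k] or log_ratios
    return 2.0 ** (sum(kept) / len(kept))


def length_corrected_tmm(counts_by_sample, lengths_bp):
    """Return per-million, length-corrected values for each sample.

    The first sample is used as the reference (an arbitrary choice made
    for this sketch).
    """
    rpk = [reads_per_kilobase(c, lengths_bp) for c in counts_by_sample]
    reference = rpk[0]
    normalized = []
    for sample in rpk:
        factor = simple_tmm_factor(sample, reference)
        scale = 1e6 / (sum(sample) * factor)
        normalized.append([value * scale for value in sample])
    return normalized


if __name__ == "__main__":
    # Illustrative, made-up counts: the second sample is sequenced twice as deep.
    lengths = [1000, 2000, 500, 1500]
    counts = [[100, 400, 50, 300], [200, 800, 100, 600]]
    for row in length_corrected_tmm(counts, lengths):
        print([round(v, 1) for v in row])
```

In this sketch, two samples that differ only in sequencing depth yield the same normalized values, which is the behaviour expected of a depth- and length-corrected measure. Because the lengths are supplied per transcript, a partial transcript can in principle be given its observed length rather than that of a full gene model.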