RNA-seq data analysis
For the RNA-seq datasets (GSE147507, GSE123938, GSE3963412), count data
was uploaded to the online MeV software and normalised using the DESeq
tool, while for GSE150316, DESeq-normalised data was already available.
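
As an illustration of the normalisation step, the following is a minimal sketch of median-of-ratios size-factor normalisation, the approach on which DESeq is based. It is not the MeV implementation; the function names `size_factors` and `normalise` and the toy counts are assumptions made for this example.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping gene -> list of raw counts, one entry per sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean is zero and the ratio is undefined.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue
        # Log of the geometric mean of this gene across samples.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Size factor of a sample = median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]

def normalise(counts):
    """Divide every count by the size factor of its sample."""
    sf = size_factors(counts)
    return {g: [c / s for c, s in zip(row, sf)] for g, row in counts.items()}

# Toy data (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
toy_counts = {"A": [10, 20], "B": [100, 200], "C": [5, 10], "D": [0, 3]}
toy_factors = size_factors(toy_counts)
toy_normalised = normalise(toy_counts)
```

In this toy example the second size factor is twice the first, so after normalisation genes A, B and C have equal values in both samples.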
Differential expression of genes between control and target groups was
analysed using the Limma pipeline. The output provided by Limma
contained a list of significantly differentially expressed genes
(p<=0.05). This list of genes was uploaded to the online
NetworkAnalyst software, and the output containing the list of
significantly enriched pathways (p<=0.05) was downloaded (the
list of pathways for each dataset is given below), along with the list
of genes implicated in each pathway. Furthermore, for each of the
enriched pathways, we examined the expression pattern of each member
gene and, based on the direction of change of the key enzymes,
regulatory proteins and neighbouring genes, together with published
studies, inferred whether the respective pathway was upregulated or
downregulated.
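
The gene filtering and pathway enrichment steps above can be sketched as follows. This is a minimal sketch that assumes the enrichment is an over-representation analysis based on the hypergeometric test; the source does not state which test NetworkAnalyst applied. The gene names, p-values, pathway and helper functions are all hypothetical. The manual assessment of pathway direction is not captured here.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background, K: genes in the pathway,
    n: differentially expressed genes, k: their overlap with the pathway.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enrich(de_genes, pathways, background):
    """Return an enrichment p-value for each pathway."""
    background = set(background)
    de = set(de_genes) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        overlap = len(de & members)
        results[name] = hypergeom_upper_tail(
            overlap, len(members), len(de), len(background)
        )
    return results

# Hypothetical per-gene p-values standing in for a differential expression table.
toy_pvalues = {"g1": 0.01, "g2": 0.03, "g3": 0.20, "g4": 0.50, "g5": 0.04,
               "g6": 0.70, "g7": 0.90, "g8": 0.30, "g9": 0.60, "g10": 0.80}

# Keep genes at the threshold used in the text (p <= 0.05).
toy_de_genes = [g for g, p in toy_pvalues.items() if p <= 0.05]

toy_pathways = {"toy_pathway": ["g1", "g2", "g3", "g4"]}
toy_enrichment = enrich(toy_de_genes, toy_pathways, toy_pvalues.keys())
```

Here three genes pass the threshold and two of them fall in the four-gene pathway, so the enrichment p-value is 40/120, or one third. A pathway would be retained only if this value were at or below 0.05.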
For the GSA dataset PRJCA002326, for which only raw read data was
available, the FASTQ reads were mapped to hg38 using STAR (v2.27.2b) to
create sample-wise BAM files. The BAM files were then processed
using the Rsamtools, Rsubread and GenomicAlignments R packages to create
the count table. The count data was subsequently analysed using the MeV
and NetworkAnalyst software as described above.
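
As an illustration of the mapping step, the sketch below assembles a STAR alignment command for one sample. The alignment parameters actually used are not reported in the text, so the options shown are assumptions. The example assumes paired-end, gzip-compressed reads, and the index directory, file names and thread count are hypothetical. The command is only built, not executed.

```python
def star_align_command(genome_dir, read1, read2, out_prefix, threads=8):
    """Build the argument list for a STAR alignment run.

    All values are illustrative assumptions; they are not the settings
    used in the study.
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,            # pre-built hg38 STAR index
        "--readFilesIn", read1, read2,        # paired-end FASTQ files
        "--readFilesCommand", "zcat",         # decompress .gz input on the fly
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", out_prefix,    # one prefix per sample
    ]

# Hypothetical sample and paths.
cmd = star_align_command(
    "star_index_hg38",
    "sample1_R1.fastq.gz",
    "sample1_R2.fastq.gz",
    "sample1_",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where STAR and the genome index are available. With these options, STAR writes one coordinate-sorted BAM file per sample under the given prefix.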