RNA-seq libraries and analysis.
Total RNA was extracted and qualified and sent to Berry Genomics (Beijing, China) for transcriptome sequencing. After sequencing, obtain a binary file in fastqc.gz format, then using Fastqc (version 0.11.7,https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) software for raw data quality assessment. Trim-galore (version 0.6.6,https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/, parameters: paired, –quality 25 –length 36, stringency 3) software was used to remove splices and low-quality reads to obtain clean reads with quality scores greater than Q20. Then, the human genome HG38 (https://hgdownload.soe.ucsc.edu/downloads.html#hg38sequence) and its annotation file (https://hgdownload.soe.ucsc.edu/downloads.html#hg38annotations) were downloaded from UCSC website (http://www.genome.ucsc.edu/), which was used to construct human genome index by HISAT2 software (hisat2, version 2.2.1,http://daehwankimlab.github.io/hisat2/ ; parameters: default parameters) and then aligned to the reference genome. Samtools software (version 1.9,https://sourceforge.net/projects/samtools/files/samtools/1.9/; parameters : -b -S -h) was used to transform Sam files into BAM files and sort them. After sorting, The featureCounts program in the subread package (version 2.0.1,http://subread.sourceforge.net/ ; parameters: -t exon -g gene_id,) was used for gene level quantification. The results were expressed by FPKM. After obtaining the count matrix, the R package DEseq2 (version 1.24.0,http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html) based on the negative binomial distribution was used for the difference analysis.
The fold change (FC) thresholds for identifying DEGs were | Log2FC | >1, and a quality score greater than Q20, i.e., false discovery rate (FDR) less than 0.05.and visualized the results by R package ggplot2 [34].