RNApipe: automated analyses of RNA-seq data based on Snakemake

Affiliation:

1. College of Informatics, Huazhong Agricultural University, Wuhan 430070, China; 2. College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China

About the author:

武乐, E-mail: wuler005@163.com

CLC number:

TP311.13

Funding:

National Natural Science Foundation of China (31970552)

    Abstract:

    Transcriptome sequencing (RNA-seq) is widely used in basic research, but the bioinformatics analysis of RNA-seq data demands considerable programming skill from researchers. To enable researchers to analyze RNA-seq data simply and efficiently, this study built an automated, modular workflow, RNApipe (GitHub: https://github.com/ywu019/RNApipe.git), on the Snakemake workflow management system and the Conda environment manager. For RNA-seq data from any species with a reference genome, RNApipe automatically performs quality control, alignment, quantification, identification of differentially expressed genes, and functional annotation analyses including GO, KEGG, and GSEA. The results of each step are presented as high-quality figures or reports, and important output files are retained. Tests and evaluation in multiple model species showed that RNApipe runs smoothly and that its annotation results are accurate. Compared with existing automated analysis pipelines, the main features of RNApipe are (i) a relatively complete workflow, (ii) default tools that consume less time and fewer resources, (iii) applicability to any species with a reference genome, (iv) comprehensive visualization, and (v) user-friendliness (easy to install, use, and extend). These features allow researchers to quickly obtain essential information from large-scale RNA-seq data.
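
The step ordering described in the abstract can be pictured as a small dependency graph that a workflow manager resolves before running anything. The following is only an illustrative sketch in plain Python, not RNApipe's actual Snakefile: the step names, the assumption that each step depends solely on the previous one, and the `run_order` helper are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names taken from the abstract.
# Each step maps to the set of steps it is assumed to depend on.
STEPS = {
    "quality_control": set(),
    "alignment": {"quality_control"},
    "quantification": {"alignment"},
    "differential_expression": {"quantification"},
    "functional_annotation": {"differential_expression"},
}


def run_order(steps):
    """Return an execution order in which every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    for name in run_order(STEPS):
        print(name)
```

Snakemake derives a comparable ordering from the input and output files declared by each rule, so the explicit dictionary here stands in for what the real workflow infers from file names.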

    Table 1 Comparison of RNApipe with other automated analysis tools
    Fig.1 The workflow chart of RNApipe analysis
    Fig.2 The file and directory structures of RNApipe
    Fig.3 Report on data quality inspection
    Fig.4 Visualization of gene expression levels among samples
    Fig.5 Visualization of differentially expressed genes among samples
    Fig.6 Visualization of gene functional annotations among samples
Cite this article

武乐, 李益楠, 孔德信, 周志鹏. RNApipe: automated analyses of RNA-seq data based on Snakemake[J]. Journal of Huazhong Agricultural University, 2022, 41(6): 143-151

History
  • Received: 2022-01-13
  • Published online: 2022-12-09