CAFU

CAFU is a Galaxy-based bioinformatics framework for the comprehensive assembly and functional annotation of unmapped RNA-seq data from single- and mixed-species samples. It integrates many existing NGS analysis tools with programs we developed, and provides an easy-to-use interface to manage, manipulate and, most importantly, explore large-scale unmapped reads.
Besides the common steps of read cleaning, read mapping, unmapped read generation and novel transcript assembly, CAFU optionally offers multiple-level evidence analysis of assembled transcripts, analysis of their sequence and expression characteristics, and functional exploration through gene co-expression analysis and genome-wide association analysis.
Taking advantage of machine learning (ML) technologies, CAFU also addresses the challenge of classifying species-specific transcripts assembled from unmapped reads in mixed-species samples.
The CAFU project is hosted on GitHub (https://github.com/cma2015/CAFU), and the web server can be accessed at http://omicstudio.cloud:4001/. The CAFU Docker image is available at https://hub.docker.com/r/malab/cafu.

Overview of functional modules in CAFU

Extraction of unmapped reads
De novo transcript assembly of unmapped reads
Evidence support of assembled transcripts
Species assignment of assembled transcripts
Sequence characterization of assembled transcripts
Expression profiles of assembled transcripts
Function annotation of assembled transcripts
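
As a rough illustration of the first module, the Python sketch below picks unmapped reads out of SAM-formatted alignment records by testing the 0x4 bit of the FLAG field, which the SAM format uses to mark an unmapped segment. It is not CAFU's own implementation; the function name `unmapped_records` and the example records (read names, sequences) are made up for this sketch.

```python
from typing import Iterable, Iterator

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_records(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield SAM alignment lines whose FLAG field has the unmapped bit set."""
    for line in sam_lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines (headers start with "@").
        if not line or line.startswith("@"):
            continue
        fields = line.split("\t")
        # A SAM alignment record has 11 mandatory tab-separated fields.
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            yield line


# Hypothetical records: read1 is mapped, read2 and read3 are unmapped
# (flag 77 = paired + unmapped + mate unmapped + first in pair).
example_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGGCCAA\tIIIIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGGGAAAA\tIIIIIIII",
]
unmapped_names = [rec.split("\t")[0] for rec in unmapped_records(example_sam)]
print(unmapped_names)
```

In practice this step is usually done on BAM files with a dedicated alignment toolkit; the pure-Python version only shows the flag logic.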

How to use CAFU

Tutorials for CAFU: https://github.com/cma2015/CAFU/blob/master/Tutorials/User_manual.md
Test datasets referred to in the tutorials for CAFU: https://github.com/cma2015/CAFU/tree/master/Test_data

News and updates

CAFU updated on Jan 1, 2019

In the function Assemble Unmapped Reads, a parameter "Memory" was added to set the maximum memory used by Trinity (1G by default).
The function Species Assignment of Transcripts can now be run with either pre-trained or self-trained models. Currently, one pre-trained model is provided; it was trained on 20,502 and 137,052 mRNAs annotated in the reference genomes of the stripe rust pathogen Puccinia striiformis f. sp. tritici (PST-78 v1) and Chinese Spring wheat (IWGSC RefSeq v1.0), respectively.
The user tutorial was updated to highlight the importance of the CPUs, Memory and Swap settings when running the CAFU Docker image.
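
To make the pre-trained versus self-trained distinction in Species Assignment of Transcripts more concrete, here is a minimal Python sketch of training a two-species model and assigning new transcripts with it. The features (3-mer frequencies) and the classifier (nearest centroid) are stand-ins chosen for brevity and are assumptions, not CAFU's actual model; the labels `species_A` and `species_B` and all sequences are toy data invented for the example.

```python
from collections import Counter
from itertools import product
from math import sqrt

K = 3  # k-mer length; an arbitrary choice for this sketch
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_profile(seq):
    """Return the normalized k-mer frequency vector of a nucleotide sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[k] for k in KMERS) or 1  # avoid division by zero
    return [counts[k] / total for k in KMERS]


def train(labelled):
    """Build a model: one mean k-mer profile (centroid) per species label."""
    sums, n = {}, Counter()
    for label, seq in labelled:
        acc = sums.setdefault(label, [0.0] * len(KMERS))
        for i, value in enumerate(kmer_profile(seq)):
            acc[i] += value
        n[label] += 1
    return {label: [v / n[label] for v in acc] for label, acc in sums.items()}


def assign(model, seq):
    """Assign a transcript to the species whose centroid is closest."""
    profile = kmer_profile(seq)

    def distance(centroid):
        return sqrt(sum((a - b) ** 2 for a, b in zip(profile, centroid)))

    return min(model, key=lambda label: distance(model[label]))


# Toy training set: species_A is AT-rich, species_B is GC-rich.
training = [
    ("species_A", "ATATATTAATATTTAAATAT"),
    ("species_A", "TTTAAATATATATTTAATAA"),
    ("species_B", "GCGCGCCGGCGCCCGGGCGC"),
    ("species_B", "CCCGGGCGCGCGCCCGGCGG"),
]
model = train(training)  # the "self-trained" case
label_1 = assign(model, "ATTTATATAAATTATA")
label_2 = assign(model, "GCCCGCGCGGGCCGCG")
print(label_1, label_2)
```

In this picture, a pre-trained model is simply a `model` object that was built earlier from reference annotations and saved, while a self-trained model is one built from the user's own labelled transcripts before calling `assign`.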

CAFU updated on Nov 30, 2018

A function Remove Contamination was added to remove potential contaminating sequences using DeconSeq (Schmieder et al., 2011).
A function Remove Batch Effect was added to remove batch effects using the R package sva (Leek et al., 2012).

CAFU released on Oct 13, 2018

The CAFU source code, web server and Docker image were released for the first time.

How to access help

For any bugs or issues, please feel free to leave a message at GitHub issues. We will try our best to deal with all issues as soon as possible.

How to cite this work

Siyuan Chen, Chengzhi Ren, Jingjing Zhai, Jiantao Yu, Xuyang Zhao, Zelong Li, Ting Zhang, Wenlong Ma, Zhaoxue Han, Chuang Ma. CAFU: a Galaxy framework for exploring unmapped RNA-Seq data. Briefings in Bioinformatics, 2020;21:676-686.
