This repository contains the official implementation of QT-GILD: Quartet Based Gene Tree Imputation Using Deep Learning Improves Phylogenomic Analyses Despite Missing Data. If you use any part of this software, please cite our paper.
QT-GILD is a quartet imputation technique for estimating species trees despite the presence of missing data.
QT-GILD is an automated and specially tailored unsupervised deep learning technique, accompanied by cues from natural language processing (NLP), which learns the quartet distribution in a given set of incomplete gene trees and generates a complete set of quartets accordingly.
- Input: A set of incomplete gene trees
- Output: The imputed quartet distribution of the gene trees
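
To illustrate why missing taxa matter for quartet-based methods, the sketch below counts how many four-taxon subsets each gene tree can contribute. It is not part of QT-GILD and does not reproduce its imputation method. The helper names (`leaf_labels`, `quartet_coverage`) and the toy trees are hypothetical, and the Newick handling is a simple regular expression that assumes unquoted leaf names.

```python
import re
from itertools import combinations


def leaf_labels(newick):
    """Return the set of leaf names in a Newick string (unquoted labels only)."""
    # A leaf label directly follows "(" or "," and ends at ":", ",", ")" or ";".
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def quartet_coverage(gene_trees):
    """Count, per gene tree, the four-taxon subsets whose taxa are all present."""
    taxa_per_tree = [leaf_labels(tree) for tree in gene_trees]
    all_taxa = sorted(set().union(*taxa_per_tree))
    all_quartets = list(combinations(all_taxa, 4))
    covered = [
        sum(1 for quartet in all_quartets if set(quartet) <= taxa)
        for taxa in taxa_per_tree
    ]
    return len(all_quartets), covered


# Hypothetical toy input: the second gene tree is missing taxon E.
trees = ["((A,B),(C,D),E);", "((A,B),(C,D));"]
total, covered = quartet_coverage(trees)
print(total, covered)  # prints: 5 [5, 1]
```

In this toy case the incomplete gene tree can contribute only one of the five possible four-taxon subsets. The remaining subsets are the kind of gap that the imputed quartet distribution is meant to fill.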
Before installing QT-GILD, please make sure that you have the following programs installed:
- Python: Version >= 3.7
- Pip: Version >= 21.0
- Java: Version >= 11.0 (if you want to generate the species trees using wQFM)
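
The version requirements above can be checked with a short script. This is a minimal sketch and not part of the repository. The function names are hypothetical, and it assumes that `java -version` prints a line containing `version "<number>"`, as common JDK distributions do.

```python
import re
import shutil
import subprocess
import sys


def major_minor(text):
    """Parse the leading 'major.minor' of a version string into a tuple of ints."""
    match = re.match(r"\s*(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def java_major(version_output):
    """Extract the Java major version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 reports itself as "1.8.x".
    if major == 1 and match.group(2):
        return int(match.group(2))
    return major


def check_prerequisites():
    """Return a dict mapping each prerequisite to True if the minimum is met."""
    results = {"python": sys.version_info[:2] >= (3, 7)}
    try:
        import pip
        pip_version = major_minor(pip.__version__)
        results["pip"] = pip_version is not None and pip_version >= (21, 0)
    except ImportError:
        results["pip"] = False
    java = shutil.which("java")
    results["java"] = False
    if java is not None:
        try:
            proc = subprocess.run([java, "-version"], capture_output=True, text=True)
            major = java_major(proc.stderr + proc.stdout)
            results["java"] = major is not None and major >= 11
        except OSError:
            pass
    return results


if __name__ == "__main__":
    print(check_prerequisites())
```

A `False` entry for `java` only matters if you want to generate species trees using wQFM.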
To install the required Python packages, use the following command:
pip install -r requirements.txt
The authors recommend installing Anaconda and using a separate conda environment to install QT-GILD.
If you use wQFM, please cite the paper "wQFM: Highly Accurate Genome-scale Species Tree Estimation from Weighted Quartets".
python QT-GILD.py -i <input-gene-tree-file> -o <output-folder>
OR
python QT-GILD.py --input <input-gene-tree-file> --output <output-folder>
python QT-GILD.py -i <input-gene-tree-file> -o <output-folder> --st
Two gene tree files are provided in the repository for testing QT-GILD. For example:
python QT-GILD.py --input test/aminota_gt.tre --output output
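
If you prefer to launch QT-GILD from another Python script, for example to process several gene tree files in a loop, the command line can be assembled with `subprocess`. This is a sketch under stated assumptions: `build_command` and `run_qt_gild` are hypothetical helpers, not part of the repository, and the script is assumed to be run from the repository root, where `QT-GILD.py` lives. Only the options shown in this README (`-i`, `-o`, `--st`) are used.

```python
import subprocess
import sys


def build_command(input_file, output_dir, st=False, script="QT-GILD.py"):
    """Assemble the QT-GILD command line as a list of arguments."""
    cmd = [sys.executable, script, "-i", input_file, "-o", output_dir]
    if st:
        # Append the --st flag exactly as it appears in the usage lines.
        cmd.append("--st")
    return cmd


def run_qt_gild(input_file, output_dir, st=False):
    """Run QT-GILD and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(input_file, output_dir, st), check=True)


# Show the arguments that would be passed after the interpreter path.
print(build_command("test/aminota_gt.tre", "output")[1:])
```

Calling `run_qt_gild("test/aminota_gt.tre", "output")` would then execute the same command as the example above.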