- Table of Contents
- General Info
- Installation
- Documentation
- Input Data Formats
- Usage
- Issues
- Acknowledgements
- Disclaimer
CLEP is a framework that contains novel methods for generating patient representations from any patient-level data and its corresponding prior knowledge encoded in a knowledge graph. The framework is depicted in the graphic below.
NOTE: Installing CLEP requires R and the limma package for R to be installed on your system. R can be downloaded from CRAN. The limma package can be installed in R with the following commands:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("limma")
The code can be installed from PyPI with:
$ pip install clep
The most recent code can be installed from the source on GitHub with:
$ pip install git+https://github.com/hybrid-kg/clep.git
For developers, the repository can be cloned from GitHub and installed in editable mode with:
$ git clone https://github.com/hybrid-kg/clep.git
$ cd clep
$ pip install -e .
Read the official docs for more information.
Symbol | Sample_1 | Sample_2 | Sample_3 |
---|---|---|---|
HGNC_ID_1 | 0.354 | 2.568 | 1.564 |
HGNC_ID_2 | 1.255 | 1.232 | 0.26452 |
HGNC_ID_3 | 3.256 | 1.5 | 1.5462 |
Note: The data must be in a tab-separated file format.
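As a minimal sketch of producing a file in this layout, the snippet below writes the example table as tab-separated text using only the Python standard library. The helper name `write_data_tsv` and the in-memory buffer are assumptions made for illustration; they are not part of CLEP.

```python
import csv
import io

# Example values copied from the table above; sample names are the column headers.
samples = ["Sample_1", "Sample_2", "Sample_3"]
values = {
    "HGNC_ID_1": [0.354, 2.568, 1.564],
    "HGNC_ID_2": [1.255, 1.232, 0.26452],
    "HGNC_ID_3": [3.256, 1.5, 1.5462],
}


def write_data_tsv(handle, samples, values):
    """Write a gene-by-sample matrix as tab-separated text (hypothetical helper)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Symbol", *samples])
    for symbol, row in values.items():
        writer.writerow([symbol, *row])


# Write to an in-memory buffer; pass an open file handle instead to write to disk.
buffer = io.StringIO()
write_data_tsv(buffer, samples, values)
data_tsv = buffer.getvalue()
print(data_tsv)
```

To write to disk, open the target path with `newline=""` and pass the handle in place of the buffer.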
FileName | Target |
---|---|
Sample_1 | Abnormal |
Sample_2 | Abnormal |
Sample_3 | Control |
Note: The data must be in a tab-separated file format.
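The design table maps each sample to its label. The following sketch, which assumes the column names `FileName` and `Target` shown above, parses such a file with the standard library and lists the control samples. The helper `read_design` is hypothetical and only illustrates the layout.

```python
import csv
import io

# The example design table from above, as tab-separated text.
design_tsv = (
    "FileName\tTarget\n"
    "Sample_1\tAbnormal\n"
    "Sample_2\tAbnormal\n"
    "Sample_3\tControl\n"
)


def read_design(handle):
    """Return a mapping of sample name to label (hypothetical helper)."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["FileName"]: row["Target"] for row in reader}


design = read_design(io.StringIO(design_tsv))
# Collect the samples whose label matches the control label.
control_samples = [name for name, target in design.items() if target == "Control"]
print(control_samples)
```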
CLEP accepts knowledge graphs in a modified version of the edge list format, which looks as follows:
Source | Relation | Target |
---|---|---|
HGNC_ID_1 | association | HGNC_ID_2 |
HGNC_ID_2 | decreases | HGNC_ID_3 |
HGNC_ID_3 | increases | HGNC_ID_1 |
Note: The data must be in a tab-separated file format. If your knowledge graph does not have relations between the source and the target, populate the relation column with "No Relation".
Note: These are only the basic commands for CLEP; the detailed options for each command can be found in the documentation.
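If an existing edge list has only source and target columns, it can be converted to the three-column layout by filling in the placeholder relation. The sketch below assumes edges are given as Python tuples; the function names are hypothetical and not part of CLEP.

```python
import csv
import io


def to_clep_edges(edges):
    """Normalise (source, target) and (source, relation, target) tuples to triples."""
    rows = []
    for edge in edges:
        if len(edge) == 2:
            # No relation available: use the placeholder the note above asks for.
            source, target = edge
            relation = "No Relation"
        else:
            source, relation, target = edge
        rows.append((source, relation, target))
    return rows


def write_edges_tsv(handle, rows):
    """Write the triples as tab-separated text with a header row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Source", "Relation", "Target"])
    writer.writerows(rows)


edges = [("HGNC_ID_1", "HGNC_ID_2"), ("HGNC_ID_2", "decreases", "HGNC_ID_3")]
buffer = io.StringIO()
write_edges_tsv(buffer, to_clep_edges(edges))
graph_tsv = buffer.getvalue()
print(graph_tsv)
```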
- Radical Searching
The following command identifies samples with extreme feature values relative to the control population.
$ clep sample-scoring radical-search --data <DATA_FILE> --design <DESIGN_FILE> --control Control --threshold 2.5 --control_based --ret_summary --out <OUTPUT_DIR>
- Graph Generation
The following command generates the patient-gene network using the chosen method (interaction_network).
$ clep embedding generate-network --data <SCORED_DATA_FILE> --method interaction_network --ret_summary --out <OUTPUT_DIR>
- Knowledge Graph Embedding
The following command generates the embedding of the network passed to it.
$ clep embedding kge --data <NETWORK_FILE> --design <DESIGN_FILE> --model_config <MODEL_CONFIG.json> --train_size 0.8 --validation_size 0.1 --out <OUTPUT_DIR>
- Classification
The following command carries out classification on the given data file for a chosen model (Elastic Net) using a chosen optimizer (Grid Search).
$ clep classify --data <EMBEDDING_FILE> --model elastic_net --optimizer grid_search --out <OUTPUT_DIR>
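The idea behind the radical searching step, flagging feature values that are extreme relative to a control population, can be illustrated with a small sketch. This is not CLEP's implementation: it assumes the threshold is a percentile cut-off applied to the empirical distribution of the control values, and the function name and the -1/0/1 encoding are hypothetical. Refer to the documentation for the actual behavior.

```python
from bisect import bisect_right


def score_against_controls(value, control_values, threshold=2.5):
    """Return -1, 0 or 1 depending on where value falls among the controls.

    Assumption: threshold is a percentile; values in the lowest or highest
    `threshold` percent of the control distribution count as extreme.
    """
    ordered = sorted(control_values)
    # Empirical CDF: percentage of control values less than or equal to value.
    percentile = 100.0 * bisect_right(ordered, value) / len(ordered)
    if percentile <= threshold:
        return -1
    if percentile >= 100.0 - threshold:
        return 1
    return 0


# Hypothetical control measurements for a single feature.
controls = list(range(1, 101))
scores = [score_against_controls(v, controls) for v in (0, 50, 200)]
print(scores)
```

With these made-up controls, a value below every control is scored -1, a mid-range value 0, and a value above every control 1.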
If you have difficulties using CLEP, please open an issue at our GitHub repository.
If you have found CLEP useful in your work, please consider citing:
CLEP: A Hybrid Data- and Knowledge- Driven Framework for Generating Patient Representations.
Bharadhwaj, V. S., Ali, M., Birkenbihl, C., Mubeen, S., Lehmann, J., Hofmann-Apitius, M., Hoyt, C. T., & Domingo-Fernandez, D. (2020).
Bioinformatics, btab340.
The CLEP logo and framework graphic were designed by Carina Steinborn.
CLEP is scientific software that has been developed in an academic capacity, and thus comes with no warranty or guarantee of maintenance, support, or back-up of data.