The implementation of Text GCN in our paper:
Liang Yao, Chengsheng Mao, Yuan Luo. "Graph Convolutional Networks for Text Classification." In 33rd AAAI Conference on Artificial Intelligence (AAAI-19), 7370-7377
Requirements:
- Python 2.7 or 3.6
- TensorFlow >= 1.4.0
- Run python remove_words.py 20ng
- Run python build_graph.py 20ng
- Run python train.py 20ng
- Change 20ng in the above 3 command lines to R8, R52, ohsumed, or mr when producing results for other datasets.
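The three steps can also be driven from a small helper. The following is only a sketch: the function names `pipeline_commands` and `run_pipeline` are hypothetical and not part of this repository, and it assumes the three scripts sit in the current working directory. The script names and dataset names are the ones listed above.

```python
import subprocess
import sys

# Script names and dataset names are taken from the steps listed above.
STEPS = ["remove_words.py", "build_graph.py", "train.py"]
DATASETS = ["20ng", "R8", "R52", "ohsumed", "mr"]


def pipeline_commands(dataset, python=sys.executable):
    """Return the three commands for one dataset, in the order listed above."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %s" % dataset)
    return [[python, step, dataset] for step in STEPS]


def run_pipeline(dataset):
    """Run the three steps one after another, stopping at the first failure."""
    for cmd in pipeline_commands(dataset):
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(cmd)
```

Calling `run_pipeline("R8")` from the repository root would run the same three commands shown above with `R8` in place of `20ng`.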
- /data/20ng.txt indicates document names, training/test split, and document labels. Each line is for one document.
- /data/corpus/20ng.txt contains the raw text of each document; each line corresponds to the same line in /data/20ng.txt.
- prepare_data.py is an example of preparing your own data. Note that '\n' is removed from your documents or sentences, so each one fits on a single line.
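As a rough illustration of the two-file layout described above, here is a minimal sketch that writes an index file and a corpus file for a custom dataset. It is not the code in prepare_data.py: the function name `write_dataset` is hypothetical, the tab separator between the three index fields is an assumption, and so are the `train`/`test` split values. Check the shipped /data/20ng.txt for the exact format before relying on it. The sketch assumes text (unicode) strings as input.

```python
import io
import os


def write_dataset(root, name, docs):
    """Write the index file and the corpus file for a custom dataset.

    docs is a list of (doc_name, split, label, text) tuples.
    """
    corpus_dir = os.path.join(root, "data", "corpus")
    if not os.path.isdir(corpus_dir):
        os.makedirs(corpus_dir)

    index_lines = []
    text_lines = []
    for doc_name, split, label, text in docs:
        # Assumption: the three index fields are tab-separated.
        index_lines.append(u"\t".join([doc_name, split, label]))
        # Collapse every run of whitespace (including '\n') into one space,
        # so that each document occupies exactly one line.
        text_lines.append(u" ".join(text.split()))

    index_path = os.path.join(root, "data", name + ".txt")
    corpus_path = os.path.join(corpus_dir, name + ".txt")
    with io.open(index_path, "w", encoding="utf-8") as f:
        f.write(u"\n".join(index_lines))
    with io.open(corpus_path, "w", encoding="utf-8") as f:
        f.write(u"\n".join(text_lines))
    return index_path, corpus_path
```

Because both files are written from the same loop, line i of the corpus file always belongs to line i of the index file, which is the correspondence the list above describes.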
An inductive version of Text GCN is fast_text_gcn, where test documents are not included in the training process.