ME-CNER

Code for CIKM 2019 paper "Exploiting Multiple Embeddings for Chinese Named Entity Recognition".

Citation

If you use this code in your work, please cite our paper:

@inproceedings{cikm19:xu,
  author    = {Canwen Xu and
               Feiyang Wang and
               Jialong Han and
               Chenliang Li},
  title     = {Exploiting Multiple Embeddings for Chinese Named Entity Recognition},
  booktitle = {The 28th ACM International Conference on Information and Knowledge Management, {CIKM} 2019, Beijing, China,
               November 3-7, 2019},
  publisher = {{ACM}},
  year      = {2019},
  url       = {https://doi.org/10.1145/3357384.3358117},
  doi       = {10.1145/3357384.3358117}
}

Requirement

Python: 3.6  
Keras: 2.2.2
Keras-contrib: 2.0.8
jieba: 0.39

Dataset

We use the standard Weibo NER dataset (Peng and Dredze, 2015) and the formal-text MSRA News dataset (Levow, 2006).

Pretrained Embeddings

The pretrained character and word embeddings are provided by Tencent AI Lab. Download them here.
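
As a minimal sketch of how such embeddings might be read, the snippet below parses the plain-text word2vec layout: a header line with the vocabulary size and dimension, followed by one token and its vector per line. It assumes the downloaded file uses this layout. The function name `load_text_embeddings` and the `wanted` filter are hypothetical and not part of this repository.

```python
import io
from typing import Dict, List, Optional, Set, TextIO


def load_text_embeddings(handle: TextIO,
                         wanted: Optional[Set[str]] = None) -> Dict[str, List[float]]:
    """Parse word2vec-style text embeddings from an open text handle."""
    header = handle.readline().split()
    dim = int(header[1])  # header is "<vocab_size> <dim>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip lines that do not match the declared dimension
        token = parts[0]
        if wanted is not None and token not in wanted:
            continue  # keep only the tokens that the task vocabulary needs
        vectors[token] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory stand-in for the real (very large) embedding file.
sample = io.StringIO("2 3\n北京 0.1 0.2 0.3\n妈妈 0.4 0.5 0.6\n")
embeddings = load_text_embeddings(sample, wanted={"北京"})
print(embeddings)
```

Filtering by a vocabulary set while streaming keeps memory use small, which matters because the full embedding file is far larger than the set of tokens in either dataset.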

The radical embedding is randomly initialized.
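
Random initialization here means that each radical starts from a random vector that is then learned during training. A minimal standard-library sketch follows. The radical list, the dimension of 4, the uniform range, and the name `init_radical_embeddings` are illustrative assumptions, not values taken from this repository.

```python
import random
from typing import Dict, List, Sequence


def init_radical_embeddings(radicals: Sequence[str], dim: int,
                            scale: float = 0.05, seed: int = 0) -> Dict[str, List[float]]:
    """Assign every radical a vector drawn uniformly from [-scale, scale]."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {r: [rng.uniform(-scale, scale) for _ in range(dim)] for r in radicals}


# Hypothetical radical inventory, for illustration only.
table = init_radical_embeddings(["氵", "木", "口"], dim=4)
for radical, vector in table.items():
    print(radical, vector)
```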

How to Run

Install all requirements

pip install keras==2.2.2  # for Keras
pip install git+https://www.github.com/keras-team/keras-contrib.git  # for CRF layer
pip install jieba  # for word segmentation

Download pretrained embeddings

Download the Tencent embeddings, extract them, and put them in process_data/data_preprocess.

Run the pre-processing code

python concat_data.py

Run the model (with different config)

python main.py --dataset ${weibo/msra} --with_radical ${1/0} --network ${convgru/cnn/bilstm} \
  --tagger ${bigrucrf/bilstmcrf} --entity_type ${all/nm/ne}

dataset:
  weibo
  msra
  
with_radical:  # input radical embedding or not
  0  # no radical embedding input, only word embedding and char embedding
  1  # with radical embedding
  
network:  # for characters
  convgru  # Conv-GRU  
  bilstm 
  cnn 
  
tagger:
  bigrucrf  # Bidirectional GRU-CRF 
  bilstmcrf  # Bidirectional LSTM-CRF 
  
entity_type:
  ne  # only Named Entity. e.g. 王小明 (Xiaoming Wang), 北京市 (Beijing City)
  nm  # only Nominal Mention. e.g. 班长 (class president), 妈妈 (mother) 
  all  # both Named Entities and Nominal Mentions
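
The options above map naturally onto a command-line parser. The sketch below shows one way to declare and validate them with `argparse`. It is an assumption about how such flags could be handled, not a copy of the repository's main.py, and the defaults are chosen only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their allowed values."""
    parser = argparse.ArgumentParser(description="ME-CNER options (illustrative sketch)")
    parser.add_argument("--dataset", choices=["weibo", "msra"], default="weibo")
    parser.add_argument("--with_radical", type=int, choices=[0, 1], default=1)
    parser.add_argument("--network", choices=["convgru", "cnn", "bilstm"], default="convgru")
    parser.add_argument("--tagger", choices=["bigrucrf", "bilstmcrf"], default="bigrucrf")
    parser.add_argument("--entity_type", choices=["all", "nm", "ne"], default="all")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--dataset", "msra", "--with_radical", "0", "--network", "cnn"]
)
print(args.dataset, args.with_radical, args.network, args.tagger, args.entity_type)
```

Declaring `choices` makes the parser reject a misspelled value (for example `--network gru`) before any training starts.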

For example, run the following command to train our final ME-CNER model on the Weibo dataset, recognizing only named entities (all nominal mentions are ignored).

python main.py --dataset weibo --with_radical 1 --network convgru --tagger bigrucrf --entity_type ne
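
To illustrate what restricting `--entity_type` means for the labels, here is a small sketch. It assumes BIO tags whose type carries a `.NAM` (named) or `.NOM` (nominal) suffix, as in the Weibo NER annotations, and maps every tag outside the selected kind to `O`. The function `filter_tags` is hypothetical; the repository's own entity_type.py may work differently.

```python
from typing import List

# Assumed mapping from the documented option values to tag suffixes.
SUFFIX = {"ne": ".NAM", "nm": ".NOM"}


def filter_tags(tags: List[str], entity_type: str) -> List[str]:
    """Keep only tags of the requested kind; replace the rest with 'O'."""
    if entity_type == "all":
        return list(tags)
    suffix = SUFFIX[entity_type]
    return [t if t.endswith(suffix) else "O" for t in tags]


tags = ["B-PER.NAM", "I-PER.NAM", "O", "B-PER.NOM", "I-PER.NOM"]
print(filter_tags(tags, "ne"))  # nominal mentions become "O"
```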
